Some user needs can only be met by leveraging the capabilities of others to undertake particular tasks. In this paper, we develop a framework, named CROWDSERVICE, which supplies crowd intelligence and labor as publicly accessible crowd services via mobile crowdsourcing. It employs a genetic algorithm to dynamically synthesize and update near-optimal cost and time constraints for each crowd service involved in a composite service, and selects a near-optimal set of workers for each crowd service to be executed. We implement the proposed framework on Android platforms, and evaluate its effectiveness, scalability and usability in both experimental and user studies.
Programming community question-answering websites such as Stack Overflow frequently encounter duplicate questions. To address this, Stack Overflow provides a mechanism for reputable users to manually mark duplicate questions, but this is laborious and leaves many duplicate questions undetected. We instead model duplicate detection as a two-stage "ranking-classification" problem over question pairs, in which we leverage ranking algorithms and develop novel features for discriminative classification. Experiments on real-world questions about multiple programming languages demonstrate that our method works very well, in some cases yielding over 25% improvement compared to state-of-the-art benchmarks.
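The two-stage pipeline can be sketched as follows: a cheap ranking stage shortlists candidate questions, and a classification stage decides each shortlisted pair. The Jaccard similarity feature and the fixed threshold below are illustrative stand-ins for the paper's richer feature set and trained classifier, not its actual method.

```python
# Minimal sketch of a two-stage "ranking then classification" duplicate
# detector. The similarity feature and threshold are illustrative
# assumptions, not the paper's actual feature set.

def tokens(text):
    return set(text.lower().split())

def jaccard(a, b):
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def rank_candidates(query, corpus, top_k=3):
    """Stage 1: cheaply rank existing questions by lexical overlap."""
    scored = sorted(corpus, key=lambda q: jaccard(query, q), reverse=True)
    return scored[:top_k]

def classify_duplicate(query, candidate, threshold=0.5):
    """Stage 2: a discriminative decision on each top-ranked pair.

    A real system would feed richer features into a trained classifier;
    here we simply threshold the similarity score.
    """
    return jaccard(query, candidate) >= threshold

corpus = [
    "how to reverse a list in python",
    "what is a segmentation fault in c",
    "reverse a python list in place",
]
query = "how do i reverse a list in python"
for cand in rank_candidates(query, corpus):
    print(cand, classify_duplicate(query, cand))
```

The ranking stage keeps the expensive classification confined to a short candidate list, which is what makes the two-stage design scale to large question archives.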
We address the problem of associating access policies with datasets and of monitoring compliance via policy-carrying data. Our contributions are a formal model in first-order logic, inspired by normative multi-agent systems, to regulate data access, and a computational model for the validation of specific use cases and the verification of policies against criteria. Existing work on access policy identifies roles as a key enabler, with which we concur, but much of the rest focuses on authentication and authorization technology. Our proposal addresses normative principles from Berners-Lee's bill of rights for the internet, with human-readable but machine-processable access control policies.
Knowledge discovery over knowledge bases motivates research on efficient querying techniques. Curated knowledge bases modelled in RDF expose SPARQL endpoints for convenient, search-engine-like querying, and call for techniques that improve querying efficiency. This article introduces a learning-based framework to accelerate overall querying speed at SPARQL endpoints. The proposed framework acts as a proxy between SPARQL endpoints and clients. By leveraging querying patterns learned from clients' queries, potentially issued queries are identified and prefetched/cached to improve overall querying speed.
Public awareness of and concerns about companies' social and environmental impacts have seen a marked increase over recent decades. The quantity of relevant information has increased in parallel, as states pass laws requiring certain forms of reporting. However, this information is typically dispersed and non-standardized, making it complicated to collect and analyse. The WikiRate.org platform aims to collect this information and store it in a standardised format within a centralised public repository. This paper introduces easIE, an easy-to-use information extraction framework that leverages general Web information extraction principles to build datasets with Environmental-Social-Governance (ESG) information from the Web.
We study the risk investors face from Bitcoin exchanges, which convert between Bitcoins and hard currency. We examine the track record of 79 Bitcoin exchanges established between 2010 and 2015. We find that nearly half (38) have since closed. 26 exchanges suffered security breaches, 15 of which subsequently closed. Using a proportional hazards model, we find that the availability of two-factor authentication, and to a lesser extent an exchange's transaction volume, influence whether or not an exchange is likely to close. Those exchanges that support two-factor authentication have 80% lower odds of closing than those which do not support it.
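The reported effect size can be read off directly from the model's coefficient: in a proportional hazards model, a covariate's coefficient beta maps to a hazard ratio exp(beta). A minimal sketch follows; the coefficient value below is back-derived from the abstract's 80% figure, not taken from the paper's fitted model.

```python
import math

# Illustrative: in a proportional hazards model, a covariate coefficient
# beta maps to a hazard ratio exp(beta). The beta here is chosen to
# reproduce the abstract's "80% lower odds" (ratio 0.2); it is NOT an
# estimate from the paper's data.
beta_2fa = math.log(0.2)           # coefficient for two-factor authentication
hazard_ratio = math.exp(beta_2fa)  # 0.2 -> 80% lower closure hazard
reduction = 1 - hazard_ratio
print(f"hazard ratio: {hazard_ratio:.2f}, reduction: {reduction:.0%}")
```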
Introduction to the Special Issue on Emerging Software Technologies for Internet-Based Systems: Internetware and DevOps
IP (Internet Protocol) version 6 (IPv6) was standardised in 1998 to address the expected exhaustion of IP version 4 (IPv4) addresses. However, the transition from IPv4 to IPv6 has been very slow in many countries. We investigate the state of IPv6 deployment in Australian and Chinese organisations based on a survey of organisations' IT staff. Compared to earlier studies, IPv6 deployment has advanced markedly, but it is still years away for a significant portion of organisations. We provide insights into the deployment problems, arguments for deploying IPv6, and ways to speed up the transition, which are relevant for many countries.
Nowadays, cloud providers offer a broad catalog of services for the rapid development, provisioning, deployment, and continuous integration of distributed applications in DevOps. However, the existence of a wide spectrum of cloud services has become a challenge, as these vary in performance and pricing models. This work addresses that challenge by providing decision support concepts and mechanisms to evaluate different potential distributions of applications spanning heterogeneous cloud services. We analyze the profitability of an application distribution by defining a utility model for decision-making tasks, which is evaluated using the MediaWiki (Wikipedia) application.
We model the abuse data generation process using phishing sites across 45,358 hosting providers. We find that 84% of the variation in abuse is explained by structural factors alone. We enrich a subset of 105 homogeneous ``statistical twins'' with additional explanatory variables and find that abuse is positively associated with website popularity and with the prevalence of CMSes, and negatively associated with price. These factors explain 77% of the remaining variation, calling into question premature inferences from raw abuse indicators about providers' security efforts, and suggesting the adoption of similar analyses in all domains where network measurement aims to inform technology policy.
We develop an adversarial-theoretic foundation for how a malicious actor will explore an enterprise network and how they will attack it, based on the concept of a system vulnerability dependency graph. Based on such a model of the adversary, we develop a mechanism by which the defender can modify the network so as to induce deception, placing honey nodes and apparent vulnerabilities into the network to minimize the expected impact of the adversary's attacks (according to multiple measures of impact).
While the number of cloud solutions is continuously increasing, the development and operation of large, distributed cloud-based applications is still challenging. A major challenge is the lack of interoperability between existing cloud solutions, which increases the complexity of maintaining and evolving complex applications potentially deployed across multiple cloud infrastructures and platforms. In this paper, we show how CloudMF leverages MDE and supports DevOps ideas to tame this complexity by providing: (i) a domain-specific language for specifying the provisioning and deployment of multi-cloud applications, and (ii) a models@run-time environment for their continuous provisioning, deployment, and adaptation.
In online social networks, users can share content that may violate the privacy of others. Recent approaches use agreement technologies to enable the stakeholders of a post to discuss its privacy configuration. However, agreement should be established over multiple posts: a user may tolerate slight breaches of privacy in return for being able to share posts themselves. Therefore, users can help each other preserve their privacy, viewing this as a social responsibility. We develop a reciprocity-based negotiation approach that combines semantic privacy rules with utility functions. We evaluate our approach in multi-agent simulations where agents mimic users based on a user study we conducted.
This paper develops a method to detect visual differences introduced into web pages when they are rendered in different browsers. To achieve this, we propose an empirical visual similarity metric that mimics human mechanisms of perception. The Gestalt laws of grouping are translated into a rule set, and a block tree is parsed by these rules for similarity calculation. During this translation process, experiments are performed to obtain metrics for a variety of Gestalt features. After a validation experiment, the empirical metric is employed to detect cross-browser differences. Experiments on popular web pages provide positive results for this methodology.
Multi-agent planning using MA-STRIPS-related models is often motivated by the preservation of private information. Such motivation is not only natural for multi-agent systems, but is also one of the main reasons why multi-agent planning (MAP) problems cannot be solved centrally. Although this motivation is common in the literature, a formal treatment of privacy is often missing. In this paper, we present an analysis of two well-known algorithms, MAFS and Secure-MAFS, in terms of a privacy leakage measure introduced in our recent work, both in general and on a particular example.
Finding the party responsible for an unpleasant situation is often difficult, especially in artificial agent societies. SCIFF is a successful formalization of agent societies, including a language to describe rules and protocols and an abductive proof-procedure for compliance checking. However, identifying who is responsible for a violation is not always straightforward. In this work, a definition of accountability for artificial societies is formalized in SCIFF. Two tools are provided for the designer of interaction protocols: a guideline, in terms of syntactic features that ensure the accountability of the protocol, and a software tool to identify, for a given protocol, whether non-accountability issues could arise.
In this paper, we propose a collaborative filtering method based on Tensor Factorization, a generalization of the Matrix Factorization approach, to model the multi-dimensional contextual information. It leads to a more compact model of the data which is naturally suitable for integrating contextual information to make POI recommendations. Based on the model, we further improve the recommendation accuracy by utilizing the internal relations within users and locations to regularize the latent factors. Experimental results on a large real-world dataset demonstrate the effectiveness of our approach.
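The factorization idea can be illustrated in its two-dimensional (matrix) special case, which the paper generalizes to tensors for contextual information. A minimal stochastic-gradient sketch follows; the toy ratings, latent dimension, learning rate, and regularization weight are all illustrative assumptions, and the paper's regularization over user/location relations is not modeled here.

```python
import random

# Minimal matrix-factorization sketch (the 2-D special case of the tensor
# factorization the paper generalizes). Data and hyperparameters are toy
# illustrative values.
random.seed(0)
ratings = {(0, 0): 5.0, (0, 1): 3.0, (1, 0): 4.0, (2, 1): 1.0}
n_users, n_items, k = 3, 2, 2
U = [[random.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
V = [[random.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]

def predict(u, i):
    return sum(U[u][f] * V[i][f] for f in range(k))

lr, reg = 0.05, 0.01
for _ in range(2000):                      # SGD over observed entries
    for (u, i), r in ratings.items():
        err = r - predict(u, i)
        for f in range(k):
            uf, vf = U[u][f], V[i][f]      # use old values for both updates
            U[u][f] += lr * (err * vf - reg * uf)
            V[i][f] += lr * (err * uf - reg * vf)

for (u, i), r in ratings.items():
    print(f"user {u}, item {i}: true {r}, predicted {predict(u, i):.2f}")
```

The compactness argument from the abstract corresponds to the factor matrices U and V replacing the full rating matrix; in the tensor case a third (context) factor is added.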
In this work, we leverage the existing social network in YouTube, where a user subscribes to another user's channel to track all of that user's uploaded videos. We propose SocialTube, which builds the subscribers of a channel into a P2P overlay and also clusters common-interest nodes at a higher level. It also incorporates a prefetching algorithm that prefetches higher-popularity videos. Extensive trace-driven simulation results and real-world PlanetLab experimental results verify the effectiveness of SocialTube at reducing server load and overlay maintenance overhead and at improving QoS for users.
Access control management is one of the issues still hindering the development of decentralized online social networks (DOSNs). In previous work, we proposed an initial audit-based model for access control in DOSNs. In this paper, we focus on optimizing the audit process and on the privacy issues emerging from records kept for audit purposes. We propose an enhanced audit selection, for which experimental results on a real OSN dataset show an improvement of more than 50% compared to the basic model. We also provide an analysis of the related privacy issues and discuss possible privacy-preserving alternatives.
The intrusiveness of Web tracking and the increasing invasiveness of digital advertising have raised serious concerns regarding user privacy and Web usability, leading a substantial fraction of users to adopt ad-blocking technologies in recent years. The problem with these technologies, however, is that they disregard the underlying economic model of the Web, which is now in danger. In this paper, we investigate an Internet technology that targets users who are not against advertising in general and accept the trade-off that comes with "free" content, but who, for privacy reasons, wish to exert fine-grained control over tracking.
When card data is exposed in a data breach but has not yet been used to attempt fraud, the overall social costs of that breach depend on whether the financial institutions that issued those cards immediately cancel them and issue new cards or instead wait until fraud is attempted. We use a parameterized model and Monte Carlo simulation to compare the cost of reissuing cards to the total expected cost of fraud if cards are not reissued. We find that automatically reissuing cards may have lower social costs than waiting until fraud is attempted.
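The comparison the model formalizes can be sketched in a few lines of Monte Carlo simulation. Every parameter below (reissue cost, per-card fraud probability, fraud loss) is an illustrative assumption, not one of the paper's calibrated estimates.

```python
import random

# Monte Carlo sketch of the reissue-vs-wait comparison. All parameter
# values are illustrative assumptions.
random.seed(42)

def expected_fraud_cost(n_cards, p_fraud, fraud_loss, trials=1_000):
    """Average total fraud cost if breached cards are NOT reissued."""
    total = 0.0
    for _ in range(trials):
        frauds = sum(1 for _ in range(n_cards) if random.random() < p_fraud)
        total += frauds * fraud_loss
    return total / trials

n_cards = 1_000
reissue_cost_per_card = 5.0          # cost of issuing a replacement card
p_fraud, fraud_loss = 0.1, 300.0     # per-card fraud probability and loss

reissue_total = n_cards * reissue_cost_per_card
wait_total = expected_fraud_cost(n_cards, p_fraud, fraud_loss)
print(f"reissue: ${reissue_total:,.0f}  wait: ${wait_total:,.0f}")
```

Under these assumed parameters, expected fraud losses dwarf the reissue cost, which illustrates the paper's qualitative conclusion; with a low enough fraud probability the comparison would flip.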
In China, in addition to poor links among ISPs, the GS (Golden Shield) blocks international channels. To avoid such disruption, we propose a seamless networking method for offshore business communication bridges that automatically switches to a VPN bypass. The method uses (1) multiple thresholds on the first derivative of the RTT (Round Trip Time) increase to recognize the start of a GS block, and (2) an absolute RTT threshold and elapsed time to detect its end. Switching failed in only 4 of 159 GS block cases. More than 20 offshore companies have continued to use the method for 3 years, and questionnaires to them show that it is almost perfect at seamlessly accessing mainland application services.
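The two-part detection rule can be sketched as follows; the threshold values and the RTT trace are illustrative assumptions, not the deployed system's tuned parameters.

```python
# Sketch of the two-part detector: a block start is flagged when the first
# derivative of RTT exceeds a rise threshold, and its end when the absolute
# RTT stays below a threshold for a minimum number of samples. All values
# are illustrative assumptions.

def detect_block(rtts, rise_threshold=100, rtt_threshold=150, hold=3):
    """Return (start_index, end_index) of a detected block, or None."""
    start = None
    for i in range(1, len(rtts)):
        if start is None:
            if rtts[i] - rtts[i - 1] > rise_threshold:   # sharp RTT rise
                start = i
        else:
            # end: RTT stays below the absolute threshold for `hold` samples
            window = rtts[i:i + hold]
            if len(window) == hold and all(r < rtt_threshold for r in window):
                return start, i
    return None

trace = [50, 55, 60, 400, 500, 480, 120, 90, 80, 70, 60]  # RTT in ms
print(detect_block(trace))
```

Requiring the RTT to stay low for several consecutive samples (the `hold` parameter) is what turns the raw threshold into an elapsed-time condition and avoids switching back during transient dips.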
Distributed cloud platforms are well suited for serving a geographically diverse user base. However, traditional cloud provisioning mechanisms that make local scaling decisions are not well suited to the temporal and spatial workload fluctuations seen by modern web applications. In this paper, we propose GeoScale, a system that provides geo-elasticity by combining model-driven proactive and agile reactive provisioning approaches. GeoScale can dynamically provision server capacity at any location based on workload dynamics. We conduct a detailed evaluation of GeoScale on Amazon's distributed cloud, and show up to 40% improvement in the 95th percentile response time when compared to traditional elasticity techniques.
We present a novel semi-supervised anomaly-based IDS technique, namely PCkAD, that detects application-level content-based attacks. Its peculiarity is to learn legitimate payload structure by splitting packets into chunks and determining the within-packet distribution of n-grams. This strategy is resistant to evasion techniques such as blending. Indeed, we prove that finding the right legitimate content is NP-hard in the presence of chunks. Moreover, it improves the false positive rate for a given detection rate with respect to the case where the spatial information is not considered. Comparisons with well-known n-gram-based IDSes show that PCkAD achieves state-of-the-art performance.
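The chunk-wise n-gram idea can be sketched as follows: a payload is split into fixed-size chunks and an n-gram model is kept per chunk position, so the detector knows where in the packet each n-gram legitimately occurs. Chunk size, n, the scoring rule, and the toy payloads are illustrative assumptions, not PCkAD's actual parameters.

```python
from collections import Counter

# Sketch of chunk-wise n-gram profiling: record the n-gram distribution per
# fixed-size chunk so spatial information within the packet is preserved.

def chunk_ngram_profile(payload, chunk_size=8, n=2):
    """Per-chunk n-gram counts for a legitimate payload."""
    profile = []
    for start in range(0, len(payload), chunk_size):
        chunk = payload[start:start + chunk_size]
        grams = Counter(chunk[i:i + n] for i in range(len(chunk) - n + 1))
        profile.append(grams)
    return profile

def anomaly_score(payload, profile, chunk_size=8, n=2):
    """Fraction of n-grams unseen in the matching chunk's legitimate model."""
    unseen = total = 0
    for start in range(0, len(payload), chunk_size):
        idx = start // chunk_size
        chunk = payload[start:start + chunk_size]
        for i in range(len(chunk) - n + 1):
            total += 1
            if idx >= len(profile) or chunk[i:i + n] not in profile[idx]:
                unseen += 1
    return unseen / total if total else 0.0

model = chunk_ngram_profile("GET /index.html")
print(anomaly_score("GET /index.html", model))   # matches the model
print(anomaly_score("XXX /evil.php!!", model))   # deviates strongly
```

Because each n-gram is checked against the model for its own chunk, a blending attacker must place legitimate content at the right positions, which is the source of the NP-hardness argument in the abstract.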
In this paper, we address the complexity of managing context changeability at runtime. We propose four maturity levels, each associated with a set of design patterns, that diminish system design complexity by selecting the combination of management processes based on the system requirements. We detail the autonomic cognitive management pattern, which represents the most mature level, able to coordinate the system's processes based on context changeability and to dynamically discover new processes that address new requirements. We apply the proposed pattern to a use case from the healthcare domain.
We exploit Decision Networks (DNs) for the analysis of attack scenarios. DNs can naturally address uncertainty at every level, including the interaction of attacks and countermeasures, making it possible to model realistic situations that are not limited to Boolean combinations of events. Furthermore, inference algorithms can be directly exploited to implement a probabilistic analysis of both the risk and the importance of attacks (with respect to specific sets of countermeasures), as well as a sound decision-theoretic analysis aimed at selecting the set of countermeasures that is optimal with respect to a specific objective function.
Mobile cloud computing is emerging as a promising approach to enrich user experiences. Computation offloading decisions and task scheduling among heterogeneous shared resources in mobile clouds are becoming challenging problems. We address these two problems together as a single optimization problem and propose a context-aware mixed integer programming model that provides offline optimal solutions for making offloading decisions and scheduling offloaded tasks among shared computing resources in heterogeneous mobile clouds. The objective is to minimize the global task completion time. We further propose an online algorithm, OCOS, based on the rent/buy problem, and prove that it is 2-competitive.
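The rent/buy (ski-rental) rule behind a 2-competitive guarantee can be sketched as follows; the costs and usage horizons are illustrative, and OCOS itself involves offloading and scheduling decisions beyond this simple rule.

```python
# Sketch of the classic rent/buy rule: keep renting until cumulative rent
# equals the buy price, then buy. Against any usage horizon, this online
# rule pays at most twice the offline optimum. Costs are illustrative.

def online_cost(days_used, rent=1, buy=10):
    """Cost of renting until total rent reaches the buy price, then buying."""
    rent_days = min(days_used, buy // rent)
    cost = rent_days * rent
    if days_used > rent_days:
        cost += buy
    return cost

def optimal_cost(days_used, rent=1, buy=10):
    """Offline optimum: buy up front or rent forever, whichever is cheaper."""
    return min(days_used * rent, buy)

for days in (3, 10, 50):
    on, opt = online_cost(days), optimal_cost(days)
    print(f"days={days}: online={on}, optimal={opt}, ratio={on / opt:.2f}")
```

The worst case is a horizon just past the break-even point, where the online rule has paid the full rent budget and then also buys, yielding exactly twice the optimum; this is the structure behind the 2-competitive bound.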
Conventional private data publication schemes target the publication of sensitive datasets. Typically, these schemes are designed with the objective of retaining as much utility as possible. Such an approach is inapplicable when users have different levels of access to the same data. In this paper, we present an anonymization framework for publishing large datasets that provides different levels of utility based on access privilege levels. Our experiments on large association graphs show that the proposed techniques are effective and scalable, and yield the required levels of privacy and utility for each user's privacy and access privilege level.
In a network, the risk of security compromises depends not only on each node's security, but also on the network structure. Understanding the likelihood of catastrophic security events is necessary for the success of diverse risk-management approaches, including cyber-insurance. However, previous network-security research has not considered features of these distributions beyond their first central moments, while previous cyber-insurance research has not considered the effect of topologies on the supply side. To bridge this gap, we provide a mathematical basis for the assessment of systematic risk in networks, and we perform a numerical study of scale-free networks that model real-world networks.
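The kind of numerical study described can be sketched by generating a preferential-attachment (scale-free) graph, propagating compromises along edges, and examining the tail of the resulting loss distribution rather than just its mean. Graph size, propagation probability, and the tail cutoff below are illustrative assumptions, not the paper's experimental setup.

```python
import random
from collections import defaultdict

# Sketch of a systematic-risk study on a scale-free network: simulate
# compromise cascades and look at the loss distribution's tail.
random.seed(1)

def scale_free_graph(n, m=2):
    """Toy preferential-attachment graph: new nodes attach to m existing
    nodes, sampled in proportion to current degree via a repeat list."""
    edges = defaultdict(set)
    edges[0].add(1); edges[1].add(0)
    targets = [0, 1]
    for v in range(2, n):
        for u in set(random.choices(targets, k=m)):
            edges[v].add(u); edges[u].add(v)
            targets += [u, v]
    return edges

def cascade_size(edges, p=0.3):
    """Compromise a random node; each edge transmits with probability p."""
    seed = random.randrange(len(edges))
    compromised, frontier = {seed}, [seed]
    while frontier:
        v = frontier.pop()
        for u in edges[v]:
            if u not in compromised and random.random() < p:
                compromised.add(u); frontier.append(u)
    return len(compromised)

g = scale_free_graph(200)
losses = [cascade_size(g) for _ in range(500)]
mean = sum(losses) / len(losses)
tail = sum(1 for l in losses if l > 2 * mean) / len(losses)
print(f"mean loss: {mean:.1f}, P(loss > 2*mean): {tail:.3f}")
```

It is precisely this tail probability, invisible in the first moment alone, that matters for cyber-insurance supply, which is the gap the abstract points to.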
In this paper, we propose a software framework, called SARIoT, for scalable and real-time provisioning of cloud-based IoT services and their data, driven by their contextual properties. The main idea behind the proposed framework is to structure the description of data-centric IoT services and their real-time and historical data in a hierarchical form, in accordance with the end-user application's context model.
Social Commitments (SCs) provide a flexible, norm-based, governance structure for sharing and receiving data. However, users of data sharing applications can subscribe to multiple SCs, possibly producing opposing sharing and receiving requirements. We propose resolving such conflicts automatically through a conflict resolution model based on relevant user values such as privacy and safety. The model predicts a user's preferred resolution by choosing the commitment that best supports the user's values. We show through an empirical user study (n=396) that values, as well as recency and norm type, significantly improve a system's ability to predict user preference in location-sharing conflicts.