转载

arXiv Paper Daily: Mon, 2 Jan 2017

Neural and Evolutionary Computing

Adult Content Recognition from Images Using a Mixture of Convolutional Neural Networks

Mundher Al-Shabi , Tee Connie , Andrew Beng Jin Teoh Subjects : Machine Learning (stat.ML) ; Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

With rapid development of the Internet, the web contents become huge. Most of

the websites are publicly available and anyone can access the contents

everywhere such as workplace, home and even schools. Nev-ertheless, not all the

web contents are appropriate for all users, especially children. An example of

these contents is pornography images which should be restricted to certain age

group. Besides, these images are not safe for work (NSFW) in which employees

should not be seen accessing such contents. Recently, convolutional neural

networks have been successfully applied to many computer vision problems.

Inspired by these successes, we propose a mixture of convolutional neural

networks for adult content recognition. Unlike other works, our method is

formulated on a weighted sum of multiple deep neural network models. The

weights of each CNN models are expressed as a linear regression problem learnt

using Ordinary Least Squares (OLS). Experimental results demonstrate that the

proposed model outperforms both single CNN model and the average sum of CNN

models in adult content recognition.

Computer Vision and Pattern Recognition

A Unified Tensor-based Active Appearance Face Model

Zhen-Hua Feng , Josef Kittler , William Christmas , Xiao-Jun Wu

Comments: 15 pages, 7 figures

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

Appearance variations result in many difficulties in face image analysis. To

deal with this challenge, we present a Unified Tensor-based Active Appearance

Model (UT-AAM) for jointly modelling the geometry and texture information of 2D

faces. In contrast with the classical Tensor-based AAM (T-AAM), the proposed

UT-AAM has four advantages: First, for each type of face information, namely

shape and texture, we construct a tensor model capturing all relevant

appearance variations. This unified tensor model contrasts with the

variation-specific models of T-AAM. Second, a strategy for dealing with

self-occluded faces is proposed to obtain consistent shape and texture

representations of faces across large pose variations. Third, our UT-AAM is

capable of constructing the model from an incomplete training dataset, using

tensor completion methods. Last, we use an effective cascaded-regression-based

method for UT-AAM fitting. With these improvements, the utility of UT-AAM in

practice is considerably enhanced in comparison with the classical T-AAM. As an

example, we demonstrate the improvements in training facial landmark detectors

through the use of UT-AAM to synthesise a large number of virtual samples.

Experimental results obtained using the Multi-PIE and 300-W face datasets

demonstrate the merits of the proposed approach.

A Joint Speaker-Listener-Reinforcer Model for Referring Expressions

Licheng Yu , Hao Tan , Mohit Bansal , Tamara L. Berg

Comments: 11 pages, 6 figures

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

; Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

Referring expressions are natural language constructions used to identify

particular objects within a scene. In this paper, we propose a unified

framework for the tasks of referring expression comprehension and generation.

Our model is composed of three modules: speaker, listener, and reinforcer. The

speaker generates referring expressions, the listener comprehends referring

expressions, and the reinforcer introduces a reward function to guide sampling

of more discriminative expressions. The listener-speaker modules are trained

jointly in an end-to-end learning framework, allowing the modules to be aware

of one another during learning while also benefiting from the discriminative

reinforcer’s feedback. We demonstrate that this unified framework and training

achieves state-of-the-art results for both comprehension and generation on

three referring expression datasets. Project and demo page:

this https URL

Memory Efficient Multi-Scale Line Detector Architecture for Retinal Blood Vessel Segmentation

Hamza Bendaoudi , Farida Cheriet , J. M. Pierre Langlois

Comments: This paper was accepted and presented at Conference on Design and Architectures for Signal and Image Processing – DASIP 2016

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

; Hardware Architecture (cs.AR)

This paper presents a memory efficient architecture that implements the

Multi-Scale Line Detector (MSLD) algorithm for real-time retinal blood vessel

detection in fundus images on a Zynq FPGA. This implementation benefits from

the FPGA parallelism to drastically reduce the memory requirements of the MSLD

from two images to a few values. The architecture is optimized in terms of

resource utilization by reusing the computations and optimizing the bit-width.

The throughput is increased by designing fully pipelined functional units. The

architecture is capable of achieving a comparable accuracy to its software

implementation but 70x faster for low resolution images. For high resolution

images, it achieves an acceleration by a factor of 323x.

Feedback Networks

Amir R. Zamir , Te-Lin Wu , Lin Sun , William Shen , Jitendra Malik , Silvio Savarese

Comments: See a video describing the method at this https URL and the website this http URL

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

Currently, the most successful learning models in computer vision are based

on learning successive representations followed by a decision layer. This is

usually actualized through feedforward multilayer neural networks, e.g.

ConvNets, where each layer forms one of such successive representations.

However, an alternative that can achieve the same goal is a feedback based

approach in which the representation is formed in an iterative manner based on

a feedback received from previous iteration’s output.

We establish that a feedback based approach has several fundamental

advantages over feedforward: it enables making early predictions at the query

time, its output naturally conforms to a hierarchical structure in the label

space (e.g. a taxonomy), and it provides a new basis for Curriculum Learning.

We observe that feedback networks develop a considerably different

representation compared to feedforward counterparts, in line with the

aforementioned advantages. We put forth a general feedback based learning

architecture with the endpoint results on par or better than existing

feedforward networks with the addition of the above advantages. We also

investigate several mechanisms in feedback architectures (e.g. skip connections

in time) and design choices (e.g. feedback length). We hope this study offers

new perspectives in quest for more natural and practical learning models.

Shape Estimation from Defocus Cue for Microscopy Images via Belief Propagation

Arnav Bhavsar Subjects : Computer Vision and Pattern Recognition (cs.CV)

In recent years, the usefulness of 3D shape estimation is being realized in

microscopic or close-range imaging, as the 3D information can further be used

in various applications. Due to limited depth of field at such small distances,

the defocus blur induced in images can provide information about the 3D shape

of the object. The task of `shape from defocus’ (SFD), involves the problem of

estimating good quality 3D shape estimates from images with depth-dependent

defocus blur. While the research area of SFD is quite well-established, the

approaches have largely demonstrated results on objects with bulk/coarse shape

variation. However, in many cases, objects studied under microscopes often

involve fine/detailed structures, which have not been explicitly considered in

most methods. In addition, given that, in recent years, large data volumes are

typically associated with microscopy related applications, it is also important

for such SFD methods to be efficient. In this work, we provide an indication of

the usefulness of the Belief Propagation (BP) approach in addressing these

concerns for SFD. BP has been known to be an efficient combinatorial

optimization approach, and has been empirically demonstrated to yield good

quality solutions in low-level vision problems such as image restoration,

stereo disparity estimation etc. For exploiting the efficiency of BP in SFD, we

assume local space-invariance of the defocus blur, which enables the

application of BP in a straightforward manner. Even with such an assumption,

the ability of BP to provide good quality solutions while using non-convex

priors, reflects in yielding plausible shape estimates in presence of fine

structures on the objects under microscopy imaging.

Action Recognition Based on Joint Trajectory Maps with Convolutional Neural Networks

Pichao Wang , Wanqing Li , Chuankun Li , Yonghong Hou Subjects : Computer Vision and Pattern Recognition (cs.CV)

Convolutional Neural Networks (ConvNets) have recently shown promising

performance in many computer vision tasks, especially image-based recognition.

How to effectively apply ConvNets to sequence-based data is still an open

problem. This paper proposes an effective yet simple method to represent

spatio-temporal information carried in (3D) skeleton sequences into three (2D)

images by encoding the joint trajectories and their dynamics into color

distribution in the images, referred to as Joint Trajectory Maps (JTM), and

adopts ConvNets to learn the discriminative features for human action

recognition. Such an image-based representation enables us to fine-tune

existing ConvNets models for the classification of skeleton sequences without

training the networks afresh. The three JTMs are generated in three orthogonal

planes and provide complimentary information to each other. The final

recognition is further improved through multiply score fusion of the three

JTMs. The proposed method was evaluated on four public benchmark datasets, the

large NTU RGB+D Dataset, MSRC-12 Kinect Gesture Dataset (MSRC-12), G3D Dataset

and UTD Multimodal Human Action Dataset (UTD-MHAD) and achieved the

state-of-the-art results.

Rotation equivariant vector field networks

Diego Marcos , Michele Volpi , Nikos Komodakis , Devis Tuia Subjects : Computer Vision and Pattern Recognition (cs.CV)

We propose a method to encode rotation equivariance or invariance into

convolutional neural networks (CNNs). Each convolutional filter is applied with

several orientations and returns a vector field that represents the magnitude

and angle of the highest scoring rotation at the given spatial location. To

propagate information about the main orientation of the different features to

each layer in the network, we propose an enriched orientation pooling, i.e. max

and argmax operators over the orientation space, allowing to keep the

dimensionality of the feature maps low and to propagate only useful

information. We name this approach RotEqNet. We apply RotEqNet to three

datasets: first, a rotation invariant classification problem, the MNIST-rot

benchmark, in which we improve over the state-of-the-art results. Then, a

neuron membrane segmentation benchmark, where we show that RotEqNet can be

applied successfully to obtain equivariance to rotation with a simple fully

convolutional architecture. Finally, we improve significantly the

state-of-the-art on the problem of estimating cars’ absolute orientation in

aerial images, a problem where the output is required to be covariant with

respect to the object’s orientation.

Deep Learning Logo Detection with Data Expansion by Synthesising Context

Hang Su , Xiatian Zhu , Shaogang Gong Subjects : Computer Vision and Pattern Recognition (cs.CV)

Logo detection in unconstrained images is challenging, particularly when only

very sparse labelled training images are accessible due to high labelling

costs. In this work, we describe a model training image synthesising method

capable of improving significantly logo detection performance when only a

handful of (e.g., 10) labelled training images captured in realistic context

are available, avoiding extensive manual labelling costs. Specifically, we

design a novel algorithm for generating Synthetic Context Logo (SCL) training

images to increase model robustness against unknown background clutters,

resulting in superior logo detection performance. For benchmarking model

performance, we introduce a new logo detection dataset TopLogo-10 collected

from top 10 most popular clothing/wearable brandname logos captured in rich

visual context. Extensive comparisons show the advantages of our proposed SCL

model over the state-of-the-art alternatives for logo detection using two

real-world logo benchmark datasets: FlickrLogo-32 and our new TopLogo-10.

Adult Content Recognition from Images Using a Mixture of Convolutional Neural Networks

Mundher Al-Shabi , Tee Connie , Andrew Beng Jin Teoh Subjects : Machine Learning (stat.ML) ; Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)

With rapid development of the Internet, the web contents become huge. Most of

the websites are publicly available and anyone can access the contents

everywhere such as workplace, home and even schools. Nev-ertheless, not all the

web contents are appropriate for all users, especially children. An example of

these contents is pornography images which should be restricted to certain age

group. Besides, these images are not safe for work (NSFW) in which employees

should not be seen accessing such contents. Recently, convolutional neural

networks have been successfully applied to many computer vision problems.

Inspired by these successes, we propose a mixture of convolutional neural

networks for adult content recognition. Unlike other works, our method is

formulated on a weighted sum of multiple deep neural network models. The

weights of each CNN models are expressed as a linear regression problem learnt

using Ordinary Least Squares (OLS). Experimental results demonstrate that the

proposed model outperforms both single CNN model and the average sum of CNN

models in adult content recognition.

Automatic labeling of molecular biomarkers of whole slide immunohistochemistry images using fully convolutional networks

Fahime Sheikhzadeh , Martial Guillaud , Rabab K. Ward Subjects : Tissues and Organs (q-bio.TO) ; Computer Vision and Pattern Recognition (cs.CV)

This paper addresses the problem of quantifying biomarkers in multi-stained

tissues, based on color and spatial information. A deep learning based method

that can automatically localize and quantify the cells expressing biomarker(s)

in a whole slide image is proposed. The deep learning network is a fully

convolutional network (FCN) whose input is the true RGB color image of a tissue

and output is a map of the different biomarkers. The FCN relies on a

convolutional neural network (CNN) that classifies each cell separately

according to the biomarker it expresses. In this study, images of

immunohistochemistry (IHC) stained slides were collected and used. More than

4,500 RGB images of cells were manually labeled based on the expressing

biomarkers. The labeled cell images were used to train the CNN (obtaining an

accuracy of 92% in a test set). The trained CNN is then extended to an FCN that

generates a map of all biomarkers in the whole slide image acquired by the

scanner (instead of classifying every cell image). To evaluate our method, we

manually labeled all nuclei expressing different biomarkers in two whole slide

images and used theses as the ground truth. Our proposed method for

immunohistochemical analysis compares well with the manual labeling by humans

(average F-score of 0.96).

Artificial Intelligence

Fuzzy Constraints Linear Discriminant Analysis

Hamid Reza Hassanzadeh , Hadi Sadoghi Yazdi , Abedin Vahedian

Journal-ref: 3rd Iranian Joint Congress on Intelligent Systems and Fuzzy

Systems, 2009

Subjects

:

Artificial Intelligence (cs.AI)

In this paper we introduce a fuzzy constraint linear discriminant analysis

(FC-LDA). The FC-LDA tries to minimize misclassification error based on

modified perceptron criterion that benefits handling the uncertainty near the

decision boundary by means of a fuzzy linear programming approach with fuzzy

resources. The method proposed has low computational complexity because of its

linear characteristics and the ability to deal with noisy data with different

degrees of tolerance. Obtained results verify the success of the algorithm when

dealing with different problems. Comparing FC-LDA and LDA shows superiority in

classification task.

PrASP Report

Matthias Nickles

Comments: Technical Report

Subjects

:

Artificial Intelligence (cs.AI)

This technical report describes the usage, syntax, semantics and core

algorithms of the probabilistic inductive logic programming framework PrASP.

PrASP is a research software which integrates non-monotonic reasoning based on

Answer Set Programming (ASP), probabilistic inference and parameter learning.

In contrast to traditional approaches to Probabilistic (Inductive) Logic

Programming, our framework imposes only little restrictions on probabilistic

logic programs. In particular, PrASP allows for ASP as well as First-Order

Logic syntax, and for the annotation of formulas with point probabilities as

well as interval probabilities. A range of widely configurable inference

algorithms can be combined in a pipeline-like fashion, in order to cover a

variety of use cases.

Curiosity-Aware Bargaining

Cédric Buron (LIP6), Sylvain Ductor (LIP6), Zahia Guessoum (LIP6, CRESTIC)

Journal-ref: STAIRS 2016, Aug 2016, The Hague, Netherlands. IOS Press, 284, pp.

27 — 38, Frontiers in Artificial Intelligence and Applications

Subjects

:

Artificial Intelligence (cs.AI)

; Multiagent Systems (cs.MA)

Opponent modeling consists in modeling the strategy or preferences of an

agent thanks to the data it provides. In the context of automated negotiation

and with machine learning, it can result in an advantage so overwhelming that

it may restrain some casual agents to be part of the bargaining process. We

qualify as “curious” an agent driven by the desire of negotiating in order to

collect information and improve its opponent model. However, neither

curiosity-based rational-ity nor curiosity-robust protocol have been studied in

automatic negotiation. In this paper, we rely on mechanism design to propose

three extensions of the standard bargaining protocol that limit information

leak. Those extensions are supported by an enhanced rationality model, that

considers the exchanged information. Also, they are theoretically analyzed and

experimentally evaluated.

When the map is better than the territory

Erik P Hoel

Comments: 15 pages, 6 figures

Subjects

:

Information Theory (cs.IT)

; Artificial Intelligence (cs.AI)

The causal structure of any system can be analyzed at a multitude of spatial

and temporal scales. It has long been thought that while higher scale (macro)

descriptions of causal structure may be useful to observers, they are at best a

compressed description and at worse leave out critical information. However,

recent research applying information theory to causal analysis has shown that

the causal structure of some systems can actually come into focus (be more

informative) at a macroscale (Hoel et al. 2013). That is, a macro model of a

system (a map) can be more informative than a fully detailed model of the

system (the territory). This has been called causal emergence. While causal

emergence may at first glance seem counterintuitive, this paper grounds the

phenomenon in a classic concept from information theory: Shannon’s discovery of

the channel capacity. I argue that systems have a particular causal capacity,

and that different causal models of those systems take advantage of that

capacity to various degrees. For some systems, only macroscale causal models

use the full causal capacity. Such macroscale causal models can either be

coarse-grains, or may leave variables and states out of the model (exogenous)

in various ways, which can improve the model’s efficacy and its informativeness

via the same mathematical principles of how error-correcting codes take

advantage of an information channel’s capacity. As model choice increase, the

causal capacity of a system approaches the channel capacity. Ultimately, this

provides a general framework for understanding how the causal structure of some

systems cannot be fully captured by even the most detailed microscopic model.

A Joint Speaker-Listener-Reinforcer Model for Referring Expressions

Licheng Yu , Hao Tan , Mohit Bansal , Tamara L. Berg

Comments: 11 pages, 6 figures

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

; Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

Referring expressions are natural language constructions used to identify

particular objects within a scene. In this paper, we propose a unified

framework for the tasks of referring expression comprehension and generation.

Our model is composed of three modules: speaker, listener, and reinforcer. The

speaker generates referring expressions, the listener comprehends referring

expressions, and the reinforcer introduces a reward function to guide sampling

of more discriminative expressions. The listener-speaker modules are trained

jointly in an end-to-end learning framework, allowing the modules to be aware

of one another during learning while also benefiting from the discriminative

reinforcer’s feedback. We demonstrate that this unified framework and training

achieves state-of-the-art results for both comprehension and generation on

three referring expression datasets. Project and demo page:

this https URL

Adaptive Lambda Least-Squares Temporal Difference Learning

Timothy A. Mann , Hugo Penedones , Shie Mannor , Todd Hester Subjects : Learning (cs.LG) ; Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

Temporal Difference learning or TD((lambda)) is a fundamental algorithm in

the field of reinforcement learning. However, setting TD’s (lambda) parameter,

which controls the timescale of TD updates, is generally left up to the

practitioner. We formalize the (lambda) selection problem as a bias-variance

trade-off where the solution is the value of (lambda) that leads to the

smallest Mean Squared Value Error (MSVE). To solve this trade-off we suggest

applying Leave-One-Trajectory-Out Cross-Validation (LOTO-CV) to search the

space of (lambda) values. Unfortunately, this approach is too computationally

expensive for most practical applications. For Least Squares TD (LSTD) we show

that LOTO-CV can be implemented efficiently to automatically tune (lambda) and

apply function optimization methods to efficiently search the space of

(lambda) values. The resulting algorithm, ALLSTD, is parameter free and our

experiments demonstrate that ALLSTD is significantly computationally faster

than the na”{i}ve LOTO-CV implementation while achieving similar performance.

Intelligent information extraction based on artificial neural network

Ahlam Ansari , Moonish Maknojia , Altamash Shaikh Subjects : Computation and Language (cs.CL) ; Artificial Intelligence (cs.AI)

Question Answering System (QAS) is used for information retrieval and natural

language processing (NLP) to reduce human effort. There are numerous QAS based

on the user documents present today, but they all are limited to providing

objective answers and process simple questions only. Complex questions cannot

be answered by the existing QAS, as they require interpretation of the current

and old data as well as the question asked by the user. The above limitations

can be overcome by using deep cases and neural network. Hence we propose a

modified QAS in which we create a deep artificial neural network with

associative memory from text documents. The modified QAS processes the contents

of the text document provided to it and find the answer to even complex

questions in the documents.

Information Retrieval

Automatic Data Deformation Analysis on Evolving Folksonomy Driven Environment

Massimiliano Dal Mas

Comments: 8 pages, 3 figures; 2 tables; for details see: this http URL

Subjects

:

Information Retrieval (cs.IR)

; Computation and Language (cs.CL); Computers and Society (cs.CY); Social and Information Networks (cs.SI)

The Folksodriven framework makes it possible for data scientists to define an

ontology environment where searching for buried patterns that have some kind of

predictive power to build predictive models more effectively. It accomplishes

this through an abstractions that isolate parameters of the predictive modeling

process searching for patterns and designing the feature set, too. To reflect

the evolving knowledge, this paper considers ontologies based on folksonomies

according to a new concept structure called “Folksodriven” to represent

folksonomies. So, the studies on the transformational regulation of the

Folksodriven tags are regarded to be important for adaptive folksonomies

classifications in an evolving environment used by Intelligent Systems to

represent the knowledge sharing. Folksodriven tags are used to categorize

salient data points so they can be fed to a machine-learning system and

“featurizing” the data.

PAMPO: using pattern matching and pos-tagging for effective Named Entities recognition in Portuguese

Conceição Rocha , Alípio Jorge , Roberta Sionara , Paula Brito , Carlos Pimenta , Solange Rezende Subjects : Information Retrieval (cs.IR) ; Computation and Language (cs.CL)

This paper deals with the entity extraction task (named entity recognition)

of a text mining process that aims at unveiling non-trivial semantic

structures, such as relationships and interaction between entities or

communities. In this paper we present a simple and efficient named entity

extraction algorithm. The method, named PAMPO (PAttern Matching and POs tagging

based algorithm for NER), relies on flexible pattern matching, part-of-speech

tagging and lexical-based rules. It was developed to process texts written in

Portuguese, however it is potentially applicable to other languages as well.

We compare our approach with current alternatives that support Named Entity

Recognition (NER) for content written in Portuguese. These are Alchemy, Zemanta

and Rembrandt. Evaluation of the efficacy of the entity extraction method on

several texts written in Portuguese indicates a considerable improvement on

(recall) and (F_1) measures.

FolksoDrivenCloud: an annotation and process application for social collaborative networking

Massimiliano Dal Mas

Comments: 9 pages, 3 figures; for details see: this http URL

Subjects

:

Computers and Society (cs.CY)

; Information Retrieval (cs.IR); Software Engineering (cs.SE); Social and Information Networks (cs.SI)

In this paper we present the FolksoDriven Cloud (FDC) built on Cloud and on

Semantic technologies. Cloud computing has emerged in these recent years as the

new paradigm for the provision of on-demand distributed computing resources.

Semantic Web can be used for relationship between different data and

descriptions of services to annotate provenance of repositories on ontologies.

The FDC service is composed of a back-end which submits and monitors the

documents, and a user front-end which allows users to schedule on-demand

operations and to watch the progress of running processes. The impact of the

proposed method is illustrated on a user since its inception.

Computation and Language

Intelligent information extraction based on artificial neural network

Ahlam Ansari , Moonish Maknojia , Altamash Shaikh Subjects : Computation and Language (cs.CL) ; Artificial Intelligence (cs.AI)

Question Answering System (QAS) is used for information retrieval and natural

language processing (NLP) to reduce human effort. There are numerous QAS based

on the user documents present today, but they all are limited to providing

objective answers and process simple questions only. Complex questions cannot

be answered by the existing QAS, as they require interpretation of the current

and old data as well as the question asked by the user. The above limitations

can be overcome by using deep cases and neural network. Hence we propose a

modified QAS in which we create a deep artificial neural network with

associative memory from text documents. The modified QAS processes the contents

of the text document provided to it and find the answer to even complex

questions in the documents.

Automatic Data Deformation Analysis on Evolving Folksonomy Driven Environment

Massimiliano Dal Mas

Comments: 8 pages, 3 figures; 2 tables; for details see: this http URL

Subjects

:

Information Retrieval (cs.IR)

; Computation and Language (cs.CL); Computers and Society (cs.CY); Social and Information Networks (cs.SI)

The Folksodriven framework makes it possible for data scientists to define an

ontology environment where searching for buried patterns that have some kind of

predictive power to build predictive models more effectively. It accomplishes

this through an abstractions that isolate parameters of the predictive modeling

process searching for patterns and designing the feature set, too. To reflect

the evolving knowledge, this paper considers ontologies based on folksonomies

according to a new concept structure called “Folksodriven” to represent

folksonomies. So, the studies on the transformational regulation of the

Folksodriven tags are regarded to be important for adaptive folksonomies

classifications in an evolving environment used by Intelligent Systems to

represent the knowledge sharing. Folksodriven tags are used to categorize

salient data points so they can be fed to a machine-learning system and

“featurizing” the data.

A Joint Speaker-Listener-Reinforcer Model for Referring Expressions

Licheng Yu , Hao Tan , Mohit Bansal , Tamara L. Berg

Comments: 11 pages, 6 figures

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

; Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

Referring expressions are natural language constructions used to identify

particular objects within a scene. In this paper, we propose a unified

framework for the tasks of referring expression comprehension and generation.

Our model is composed of three modules: speaker, listener, and reinforcer. The

speaker generates referring expressions, the listener comprehends referring

expressions, and the reinforcer introduces a reward function to guide sampling

of more discriminative expressions. The listener-speaker modules are trained

jointly in an end-to-end learning framework, allowing the modules to be aware

of one another during learning while also benefiting from the discriminative

reinforcer’s feedback. We demonstrate that this unified framework and training

achieves state-of-the-art results for both comprehension and generation on

three referring expression datasets. Project and demo page:

this https URL

PAMPO: using pattern matching and pos-tagging for effective Named Entities recognition in Portuguese

Conceição Rocha , Alípio Jorge , Roberta Sionara , Paula Brito , Carlos Pimenta , Solange Rezende Subjects : Information Retrieval (cs.IR) ; Computation and Language (cs.CL)

This paper deals with the entity extraction task (named entity recognition)

of a text mining process that aims at unveiling non-trivial semantic

structures, such as relationships and interaction between entities or

communities. In this paper we present a simple and efficient named entity

extraction algorithm. The method, named PAMPO (PAttern Matching and POs tagging

based algorithm for NER), relies on flexible pattern matching, part-of-speech

tagging and lexical-based rules. It was developed to process texts written in

Portuguese, however it is potentially applicable to other languages as well.

We compare our approach with current alternatives that support Named Entity

Recognition (NER) for content written in Portuguese. These are Alchemy, Zemanta

and Rembrandt. Evaluation of the efficacy of the entity extraction method on

several texts written in Portuguese indicates a considerable improvement on

(recall) and (F_1) measures.

Distributed, Parallel, and Cluster Computing

Correctness of Hierarchical MCS Locks with Timeout

Milind Chabbi , Abdelhalim Amer , Shasha Wen , Xu Liu Subjects : Distributed, Parallel, and Cluster Computing (cs.DC)

This manuscript serves as a correctness proof of the Hierarchical MCS locks

with Timeout (HMCS-T) described in our paper titled “An Efficient

Abortable-locking Protocol for Multi-level NUMA Systems” appearing in the

proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of

Parallel Programming.

HMCS-T is a very involved protocol. The system is stateful; the values of

prior acquisition efforts affect the subsequent acquisition efforts. Also, the

status of successors, predecessors, ancestors, and descendants affect steps

followed by the protocol. The ability to make the protocol fully non-blocking

leads to modifications to the exttt{next} field, which causes deviation from

the original MCS lock protocol both in acquisition and release. At several

places, unconditional field updates are replaced with SWAP or CAS operations.

We follow a multi-step approach to prove the correctness of HMCS-T. To

demonstrate the correctness of the HMCS-T lock, we use the Spin model checking.

Model checking causes a combinatorial explosion even to simulate a handful of

threads. First, we understand the minimal, sufficient configurations necessary

to prove safety properties of a single level of lock in the tree. We construct

HMCS-T locks that represent these configurations. We model check these

configurations, which proves the correctness of components of an HMCS-T lock.

Finally, building upon these facts, we argue logically for the correctness of

HMCS-T<n>.

The Balance Attack Against Proof-Of-Work Blockchains: The R3 Testbed as an Example

Christopher Natoli , Vincent Gramoli Subjects : Distributed, Parallel, and Cluster Computing (cs.DC) ; Cryptography and Security (cs.CR)

In this paper, we identify a new form of attack, called the Balance attack,

against proof-of-work blockchain systems. The novelty of this attack consists

of delaying network communications between multiple subgroups of nodes with

balanced mining power. Our theoretical analysis captures the precise tradeoff

between the network delay and the mining power of the attacker needed to double

spend in Ethereum with high probability.

We quantify our probabilistic analysis with statistics taken from the R3

consortium, and show that a single machine needs 20 minutes to attack the

consortium. Finally, we run an Ethereum private chain in a distributed system

with similar settings as R3 to demonstrate the feasibility of the approach, and

discuss the application of the Balance attack to Bitcoin. Our results clearly

confirm that main proof-of-work blockchain protocols can be badly suited for

consortium blockchains.

Learning

Linking the Neural Machine Translation and the Prediction of Organic Chemistry Reactions

Juno Nam , Jurae Kim

Comments: 19 pages, 5 figures

Subjects

:

Learning (cs.LG)

Finding the main product of a chemical reaction is one of the important

problems of organic chemistry. This paper describes a method of applying a

neural machine translation model to the prediction of organic chemical

reactions. In order to translate ‘reactants and reagents’ to ‘products’, a

gated recurrent unit based sequence-to-sequence model and a parser to generate

input tokens for model from reaction SMILES strings were built. Training sets

are composed of reactions from the patent databases, and reactions manually

generated applying the elementary reactions in an organic chemistry textbook of

Wade. The trained models were tested by examples and problems in the textbook.

The prediction process does not need manual encoding of rules (e.g., SMARTS

transformations) to predict products, hence it only needs sufficient training

reaction sets to learn new types of reactions.

Adaptive Lambda Least-Squares Temporal Difference Learning

Timothy A. Mann , Hugo Penedones , Shie Mannor , Todd Hester Subjects : Learning (cs.LG) ; Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

Temporal Difference learning or TD((lambda)) is a fundamental algorithm in

the field of reinforcement learning. However, setting TD’s (lambda) parameter,

which controls the timescale of TD updates, is generally left up to the

practitioner. We formalize the (lambda) selection problem as a bias-variance

trade-off where the solution is the value of (lambda) that leads to the

smallest Mean Squared Value Error (MSVE). To solve this trade-off we suggest

applying Leave-One-Trajectory-Out Cross-Validation (LOTO-CV) to search the

space of (lambda) values. Unfortunately, this approach is too computationally

expensive for most practical applications. For Least Squares TD (LSTD) we show

that LOTO-CV can be implemented efficiently to automatically tune (lambda) and

apply function optimization methods to efficiently search the space of

(lambda) values. The resulting algorithm, ALLSTD, is parameter free and our

experiments demonstrate that ALLSTD is significantly computationally faster

than the na”{i}ve LOTO-CV implementation while achieving similar performance.

Automatic Discoveries of Physical and Semantic Concepts via Association Priors of Neuron Groups

Shuai Li , Kui Jia , Xiaogang Wang Subjects : Learning (cs.LG)

The recent successful deep neural networks are largely trained in a

supervised manner. It {it associates} complex patterns of input samples with

neurons in the last layer, which form representations of {it concepts}. In

spite of their successes, the properties of complex patterns associated a

learned concept remain elusive. In this work, by analyzing how neurons are

associated with concepts in supervised networks, we hypothesize that with

proper priors to regulate learning, neural networks can automatically associate

neurons in the intermediate layers with concepts that are aligned with real

world concepts, when trained only with labels that associate concepts with top

level neurons, which is a plausible way for unsupervised learning. We develop a

prior to verify the hypothesis and experimentally find the proposed priors help

neural networks automatically learn both basic physical concepts at the lower

layers, e.g., rotation of filters, and highly semantic concepts at the higher

layers, e.g., fine-grained categories of an entry-level category.

The Neural Hawkes Process: A Neurally Self-Modulating Multivariate Point Process

Hongyuan Mei , Jason Eisner Subjects : Learning (cs.LG) ; Machine Learning (stat.ML)

Many events occur in the world. Some event types are stochastically excited

or inhibited—in the sense of having their probabilities elevated or

decreased—by patterns in the sequence of previous events. Discovering such

patterns can help us predict which type of event will happen next and when.

Learning such structure should benefit various applications, including medical

prognosis, consumer behavior, and social media activity prediction. We propose

to model streams of discrete events in continuous time, by constructing a

neurally self-modulating multivariate point process. This generative model

allows past events to influence the future in complex ways, by conditioning

future event intensities on the hidden state of a recurrent neural network that

has consumed the stream of past events. We evaluate our model on multiple

datasets and show that it significantly outperforms other strong baselines.

Counterfactual Prediction with Deep Instrumental Variables Networks

Jason Hartford , Greg Lewis , Kevin Leyton-Brown , Matt Taddy Subjects : Applications (stat.AP) ; Learning (cs.LG); Machine Learning (stat.ML)

We are in the middle of a remarkable rise in the use and capability of

artificial intelligence. Much of this growth has been fueled by the success of

deep learning architectures: models that map from observables to outputs via

multiple layers of latent representations. These deep learning algorithms are

effective tools for unstructured prediction, and they can be combined in AI

systems to solve complex automated reasoning problems. This paper provides a

recipe for combining ML algorithms to solve for causal effects in the presence

of instrumental variables — sources of treatment randomization that are

conditionally independent from the response. We show that a flexible IV

specification resolves into two prediction tasks that can be solved with deep

neural nets: a first-stage network for treatment prediction and a second-stage

network whose loss function involves integration over the conditional treatment

distribution. This Deep IV framework imposes some specific structure on the

stochastic gradient descent routine used for training, but it is general enough

that we can take advantage of off-the-shelf ML capabilities and avoid extensive

algorithm customization. We outline how to obtain out-of-sample causal

validation in order to avoid over-fit. We also introduce schemes for both

Bayesian and frequentist inference: the former via a novel adaptation of

dropout training, and the latter via a data splitting routine.

Data driven estimation of Laplace-Beltrami operator

Frédéric Chazal (DATASHAPE), Ilaria Giulini (DATASHAPE), Bertrand Michel (LSTA)

Journal-ref: 30th Conference on Neural Information Processing Systems (NIPS

2016), Dec 2016, Barcelona, Spain. 30th Conference on Neural Information

Processing Systems (NIPS 2016), Barcelona, Spain., 2016

Subjects

:

Computational Geometry (cs.CG)

; Learning (cs.LG); Statistics Theory (math.ST)

Approximations of Laplace-Beltrami operators on manifolds through graph

Lapla-cians have become popular tools in data analysis and machine learning.

These discretized operators usually depend on bandwidth parameters whose tuning

remains a theoretical and practical problem. In this paper, we address this

problem for the unnormalized graph Laplacian by establishing an oracle

inequality that opens the door to a well-founded data-driven procedure for the

bandwidth selection. Our approach relies on recent results by Lacour and

Massart [LM15] on the so-called Lepski’s method.

Information Theory

When the map is better than the territory

Erik P Hoel

Comments: 15 pages, 6 figures

Subjects

:

Information Theory (cs.IT)

; Artificial Intelligence (cs.AI)

The causal structure of any system can be analyzed at a multitude of spatial

and temporal scales. It has long been thought that while higher scale (macro)

descriptions of causal structure may be useful to observers, they are at best a

compressed description and at worse leave out critical information. However,

recent research applying information theory to causal analysis has shown that

the causal structure of some systems can actually come into focus (be more

informative) at a macroscale (Hoel et al. 2013). That is, a macro model of a

system (a map) can be more informative than a fully detailed model of the

system (the territory). This has been called causal emergence. While causal

emergence may at first glance seem counterintuitive, this paper grounds the

phenomenon in a classic concept from information theory: Shannon’s discovery of

the channel capacity. I argue that systems have a particular causal capacity,

and that different causal models of those systems take advantage of that

capacity to various degrees. For some systems, only macroscale causal models

use the full causal capacity. Such macroscale causal models can either be

coarse-grains, or may leave variables and states out of the model (exogenous)

in various ways, which can improve the model’s efficacy and its informativeness

via the same mathematical principles of how error-correcting codes take

advantage of an information channel’s capacity. As model choice increase, the

causal capacity of a system approaches the channel capacity. Ultimately, this

provides a general framework for understanding how the causal structure of some

systems cannot be fully captured by even the most detailed microscopic model.

Unified Theory for Recovery of Sparse Signals in a General Transform Domain

Kiryung Lee , Yanjun Li , Kyong Hwan Jin , Jong Chul Ye

Comments: 41 pages, 3 figures

Subjects

:

Information Theory (cs.IT)

Compressed sensing provided a new sampling paradigm for sparse signals.

Remarkably, it has been shown that practical algorithms provide robust recovery

from noisy linear measurements acquired at a near optimal sample rate. In

real-world applications, a signal of interest is typically sparse not in the

canonical basis but in a certain transform domain, such as the wavelet or the

finite difference domain. The theory of compressed sensing was extended to this

analysis sparsity model. However, existing theoretic results for the analysis

sparsity model have limitations in the following senses: the derivation relies

on particular choices of sensing matrix and sparsifying transform; sample

complexity is significantly suboptimal compared to the analogous result in

compressed sensing with the canonical sparsity model. In this paper, we propose

a novel unified theory for robust recovery of sparse signals in a general

transform domain using convex relaxation methods that overcomes the

aforementioned limitations. Our results do not rely on any specific choice of

sensing matrix and sparsifying transform and provide near optimal sample

complexity when the transforms satisfy certain conditions. In particular, our

theory extends existing near optimal sample complexity results to a wider class

of sensing matrix and sparsifying transform. We also provide extensions of our

results to the scenarios where the atoms in the transform has varying

incoherence parameters and the unknown signal exhibits a structured sparsity

pattern. Numerical results show that the proposed variable density sampling

provides superior recovery performance over previous works, which is aligned

with the presented theory.

A Scalable Framework for CSI Feedback in FDD Massive MIMO via DL Path Aligning

Xiliang Luo , Penghao Cai , Xiaoyu Zhang , Die Hu , Cong Shen

Comments: 13 pages, 8 figures

Subjects

:

Information Theory (cs.IT)

Unlike the time-division duplexing (TDD) systems, the downlink (DL) and

uplink (UL) channels are not reciprocal anymore in the case of

frequency-division duplexing (FDD). However, some long-term parameters, e.g.

the time delays and angles of arrival (AoAs) of the channel paths, still enjoy

reciprocity. In this paper, by efficiently exploiting the aforementioned

limited reciprocity, we address the DL channel state information (CSI) feedback

in a practical wideband massive multiple-input multiple-output (MIMO) system

operating in the FDD mode. With orthogonal frequency-division multiplexing

(OFDM) waveform and assuming frequency-selective fading channels, we propose a

scalable framework for the DL pilots design, DL CSI acquisition, and the

corresponding CSI feedback in the UL. In particular, the base station (BS) can

transmit the FFT-based pilots with the carefully-selected phase shifts. Then

the user can rely on the so-called time-domain aggregate channel (TAC) to

derive the feedback of reduced imensionality according to either its own

knowledge about the statistics of the DL channels or the instruction from the

serving BS. We demonstrate that each user can just feed back one scalar number

per DL channel path for the BS to recover the DL CSIs. Comprehensive numerical

results further corroborate our designs.

Optimal Transmission Policies for Multi-hop Energy Harvesting Systems

Milad Rezaee , Mahtab Mirmohseni , Vaneet Aggarwal (Senior Member, IEEE), Mohammad Reza Aref

Comments: paper has been submitted to IEEE Transactions on Wireless Communications

Subjects

:

Information Theory (cs.IT)

In this paper, we consider a multi-hop energy harvesting (EH) communication

system in a full-duplex mode, where arrival data and harvested energy curves in

the source and the relays are modeled as general functions. This model includes

the EH system with discrete arrival processes as a special case. We investigate

the throughput maximization problem considering minimum utilized energy in the

source and relays and find the optimal offline algorithm. We show that the

optimal solution of the two-hop transmission problem have three main steps: (i)

Solving a point-to-point throughput maximization problem at the source; (ii)

Solving a point-to-point throughput maximization problem at the relay (after

applying the solution of first step as the input of this second problem); (iii)

Minimizing utilized energy in the source. In addition, we show that how the

optimal algorithm for the completion time minimization problem can be derived

from the proposed algorithm for throughput maximization problem. Also, for the

throughput maximization problem, we propose an online algorithm and show that

it is more efficient than the benchmark one (which is a direct application of

an existing point-to-point online algorithm to the multi-hop system).

The ((n,m,k,λ))-Strong External Difference Family with (m geq 5) Exists

Jiejing Wen , Minghui Yang , Keqin Feng

Comments: 6 pages

Subjects

:

Information Theory (cs.IT)

The notion of strong external difference family (SEDF) in a finite abelian

group ((G,+)) is raised by M. B. Paterson and D. R. Stinson [5] in 2016 and

motivated by its application in communication theory to construct (R)-optimal

regular algebraic manipulation detection code. A series of

((n,m,k,lambda))-SEDF’s have been constructed in [5, 4, 2, 1] with (m=2). In

this note we present an example of (243, 11, 22, 20)-SEDF in finite field

(mathbb{F}_q) ((q=3^5=243).) This is an answer for the following problem

raised in [5] and continuously asked in [4, 2, 1]: if there exists an

((n,m,k,lambda))-SEDF for (mgeq 5).

Generalized Hamming weights of three classes of linear codes

Gaopeng Jian Subjects : Information Theory (cs.IT)

The generalized Hamming weights of a linear code have been extensively

studied since Wei first use them to characterize the cryptography performance

of a linear code over the wire-tap channel of type II. In this paper, we

investigate the generalized Hamming weights of three classes of linear codes

constructed through defining sets and determine them partly for some cases.

Particularly, in the semiprimitive case we solve an problem left in Yang et al.

(IEEE Trans. Inf. Theory 61(9): 4905–4913, 2015).

Graph Information Ratio

Lele Wang , Ofer Shayevitz Subjects : Information Theory (cs.IT) ; Discrete Mathematics (cs.DM); Combinatorics (math.CO)

We introduce the notion of information ratio ( ext{Ir}(H/G)) between two

(simple, undirected) graphs (G) and (H), defined as the supremum of ratios

(k/n) such that there exists a mapping between the strong products (G^k) to

(H^n) that preserves non-adjacency. Operationally speaking, the information

ratio is the maximal number of source symbols per channel use that can be

reliably sent over a channel with a confusion graph (H), where reliability is

measured w.r.t. a source confusion graph (G). Various results are provided,

including in particular lower and upper bounds on ( ext{Ir}(H/G)) in terms of

different graph properties, inequalities and identities for behavior under

strong product and disjoint union, relations to graph cores, and notions of

graph criticality. Informally speaking, ( ext{Ir}(H/G)) can be interpreted as

a measure of similarity between (G) and (H). We make this notion precise by

introducing the concept of information equivalence between graphs, a more

quantitative version of homomorphic equivalence. We then describe a natural

partial ordering over the space of information equivalence classes, and endow

it with a suitable metric structure that is contractive under the strong

product. Various examples and open problems are discussed.

A Brief Introduction on Shannon's Information Theory

Ricky X. F. Chen

Comments: A brief introduction to Shannon’s information theory in a combinatorial flavor, completed for quite a while. Comments are welcome

Subjects

:

Information Theory (cs.IT)

; Combinatorics (math.CO)

This is an introduction to Shannon’s Information Theory. It is more like a

long note so that it is by no means a complete survey or completely

mathematically rigorous. It covers two main topics: entropy and channel

capacity. All related concepts will be developed in a totally combinatorial

flavor. Some issues usually not addressed in the literature will be discussed

here as well. Hopefully, it will be interesting to those interested in

Information Theory.

Channel Measurements and Models for High-Speed Train Wireless Communication Systems in Tunnel Scenarios: A Survey

Yu Liu , Ammar Ghazal , Cheng-Xiang Wang , Xiaohu Ge , Yang Yang , Yapei Zhang Subjects : Networking and Internet Architecture (cs.NI) ; Information Theory (cs.IT)

The rapid developments of high-speed trains (HSTs) introduce new challenges

to HST wireless communication systems. Realistic HST channel models play a

critical role in designing and evaluating HST communication systems. Due to the

length limitation, bounding of tunnel itself, and waveguide effect, channel

characteristics in tunnel scenarios are very different from those in other HST

scenarios. Therefore, accurate tunnel channel models considering both

large-scale and small-scale fading characteristics are essential for HST

communication systems. Moreover, certain characteristics of tunnel channels

have not been investigated sufficiently. This article provides a comprehensive

review of the measurement campaigns in tunnels and presents some tunnel channel

models using various modeling methods. Finally, future directions in HST tunnel

channel measurements and modeling are discussed.

欢迎加入我爱机器学习QQ8群:19517895

arXiv Paper Daily: Mon, 2 Jan 2017

微信扫一扫,关注我爱机器学习公众号

微博:我爱机器学习

原文  https://www.52ml.net/21454.html
正文到此结束
Loading...