
arXiv Paper Daily: Fri, 27 Jan 2017

Computer Vision and Pattern Recognition

Pose Invariant Embedding for Deep Person Re-identification

Liang Zheng, Yujia Huang, Huchuan Lu, Yi Yang
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Pedestrian misalignment, which mainly arises from detector errors and pose variations, is a critical problem for a robust person re-identification (re-ID) system. With bad alignment, the background noise will significantly compromise the feature learning and matching process. To address this problem, this paper introduces the pose invariant embedding (PIE) as a pedestrian descriptor. First, in order to align pedestrians to a standard pose, the PoseBox structure is introduced, which is generated through pose estimation followed by affine transformations. Second, to reduce the impact of pose estimation errors and information loss during PoseBox construction, we design a PoseBox fusion (PBF) CNN architecture that takes the original image, the PoseBox, and the pose estimation confidence as input. The proposed PIE descriptor is thus defined as the fully connected layer of the PBF network for the retrieval task. Experiments are conducted on the Market-1501, CUHK03, and VIPeR datasets. We show that PoseBox alone yields decent re-ID accuracy and that when integrated in the PBF network, the learned PIE descriptor produces competitive performance compared with the state-of-the-art approaches.
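The paper does not include code; as a rough illustration of the "pose estimation followed by affine transformations" step, the sketch below fits a least-squares affine map from detected keypoints to a canonical pose (the keypoint coordinates and the three-point setup are hypothetical, not the authors' implementation):

```python
import numpy as np

def estimate_affine(src_pts, dst_pts):
    """Least-squares affine transform mapping src_pts onto dst_pts.

    src_pts, dst_pts: (N, 2) arrays of corresponding keypoints, N >= 3.
    Returns a 2x3 matrix A with dst ~= A @ [x, y, 1]^T.
    """
    n = src_pts.shape[0]
    X = np.hstack([src_pts, np.ones((n, 1))])             # (N, 3) design matrix
    A, _, _, _ = np.linalg.lstsq(X, dst_pts, rcond=None)  # (3, 2) solution
    return A.T                                            # (2, 3)

# Hypothetical shoulder/shoulder/hip keypoints mapped to a canonical
# upright pose before cropping a PoseBox-style body region.
detected = np.array([[48.0, 60.0], [82.0, 63.0], [66.0, 140.0]])
canonical = np.array([[40.0, 50.0], [88.0, 50.0], [64.0, 150.0]])
A = estimate_affine(detected, canonical)
print(A @ np.array([48.0, 60.0, 1.0]))  # ~ [40, 50]
```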

Unlabeled Samples Generated by GAN Improve the Person Re-identification Baseline in vitro

Zhedong Zheng, Liang Zheng, Yi Yang
Subjects: Computer Vision and Pattern Recognition (cs.CV)

In this paper, we contribute a simple semi-supervised pipeline that uses only the original training set, without extra data collection. The challenge lies in 1) how to obtain more training data from the training set alone and 2) how to use the newly generated data. In this work, generative adversarial networks (GANs) are used to generate unlabeled samples. We propose the label smoothing regularization for outliers (LSRO) scheme, which assigns a uniform label distribution to the unlabeled images; this regularizes the supervised model and improves a ResNet baseline.

We verify the proposed method on a practical task: person re-identification (re-ID). This task aims to retrieve the query person from other cameras. We adopt DCGAN for sample generation and a baseline convolutional neural network (CNN) for embedding learning. In our experiments, we show that adding the GAN-generated data effectively improves the discriminative ability of the learned feature embedding. We evaluate the re-ID performance on two large-scale datasets, Market1501 and CUHK03, and obtain +4.37% and +1.6% improvements in rank-1 precision over the CNN baseline on Market1501 and CUHK03, respectively.
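The LSRO loss itself is simple to state: a real image keeps its one-hot cross-entropy, while a GAN-generated image is scored against the uniform distribution over the K identities. A minimal NumPy sketch of that idea (not the authors' implementation):

```python
import numpy as np

def lsro_loss(logits, labels, is_generated):
    """LSRO-style loss: cross-entropy for real images, uniform-label
    cross-entropy for GAN-generated images.

    logits: (B, K) class scores; labels: (B,) int ids (ignored for
    generated samples); is_generated: (B,) boolean mask.
    """
    z = logits - logits.max(axis=1, keepdims=True)        # stable log-softmax
    log_p = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    real_loss = -log_p[np.arange(len(labels)), labels]    # -log p(y)
    uniform_loss = -log_p.mean(axis=1)                    # -(1/K) sum_k log p(k)
    return np.where(is_generated, uniform_loss, real_loss).mean()

logits = np.random.randn(4, 10)
print(lsro_loss(logits, np.array([3, 1, 0, 0]),
                np.array([False, False, True, True])))
```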

Super-resolution Using Constrained Deep Texture Synthesis

Libin Sun, James Hays
Comments: 13 pages, 11 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Hallucinating high frequency image details in single image super-resolution is a challenging task. Traditional super-resolution methods tend to produce oversmoothed output images due to the ambiguity in mapping between low and high resolution patches. We build on recent success in deep learning based texture synthesis and show that this rich feature space can facilitate successful transfer and synthesis of high frequency image details to improve the visual quality of super-resolution results on a wide variety of natural textures and images.

Sparse Ternary Codes for similarity search have higher coding gain than dense binary codes

Sohrab Ferdowsi, Slava Voloshynovskiy, Dimche Kostadinov, Taras Holotyak
Comments: Submitted to ISIT 2017
Subjects: Information Theory (cs.IT); Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR)

This paper addresses the problem of Approximate Nearest Neighbor (ANN) search in pattern recognition, where feature vectors in a database are encoded as compact codes in order to speed up the similarity search in large-scale databases. Considering the ANN problem from an information-theoretic perspective, we interpret it as an encoding which maps the original feature vectors to a less-entropic sparse representation while requiring them to be as informative as possible. We then define the coding gain for ANN search using information-theoretic measures. We next show that the classical approach to this problem, which consists of binarization of the projected vectors, is sub-optimal. Instead, we show that a recently proposed ternary encoding achieves higher coding gains.
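As a concrete (if simplified) picture of what a sparse ternary code looks like, the sketch below projects feature vectors and keeps only the largest-magnitude coordinates as +/-1; the projection matrix and sparsity level are illustrative assumptions, not the paper's exact construction:

```python
import numpy as np

def ternary_encode(X, W, sparsity=0.1):
    """Map features to sparse codes in {-1, 0, +1}.

    X: (n, d) features; W: (d, m) projection; roughly `sparsity * m`
    nonzeros are kept per code (the top magnitudes after projection).
    """
    Y = X @ W
    k = max(1, int(sparsity * Y.shape[1]))
    codes = np.zeros_like(Y, dtype=np.int8)
    rows = np.arange(Y.shape[0])[:, None]
    idx = np.argsort(-np.abs(Y), axis=1)[:, :k]            # top-k magnitudes
    codes[rows, idx] = np.sign(Y[rows, idx]).astype(np.int8)
    return codes

rng = np.random.default_rng(0)
codes = ternary_encode(rng.standard_normal((5, 64)),
                       rng.standard_normal((64, 256)))
print((codes != 0).sum(axis=1))  # ~25 nonzeros per 256-dim code
```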

Learning Word-Like Units from Joint Audio-Visual Analysis

David Harwath, James R. Glass
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)

Given a collection of images and spoken audio captions, we present a method for discovering word-like acoustic units in the continuous speech signal and grounding them to semantically relevant image regions. For example, our model is able to detect spoken instances of the word ‘lighthouse’ within an utterance and associate them with image regions containing lighthouses. We do not use any form of conventional automatic speech recognition, nor do we use any text transcriptions or conventional linguistic annotations. Our model effectively implements a form of spoken language acquisition, in which the computer learns not only to recognize word categories by sound, but also to enrich the words it learns with semantics by grounding them in images.

Artificial Intelligence

Ethical Considerations in Artificial Intelligence Courses

Emanuelle Burton, Judy Goldsmith, Sven Koenig, Benjamin Kuipers, Nicholas Mattei, Toby Walsh
Comments: 29 pages including all case studies and links to video media on YouTube
Subjects: Artificial Intelligence (cs.AI); Computers and Society (cs.CY); General Literature (cs.GL)

The recent surge in interest in ethics in artificial intelligence may leave many educators wondering how to address moral, ethical, and philosophical issues in their AI courses. As instructors we want to develop curriculum that not only prepares students to be artificial intelligence practitioners, but also to understand the moral, ethical, and philosophical impacts that artificial intelligence will have on society. In this article we provide practical case studies and links to resources for use by AI educators. We also provide concrete suggestions on how to integrate AI ethics into a general artificial intelligence course and how to teach a stand-alone artificial intelligence ethics course.

Dynamic time warping distance for message propagation classification in Twitter

Siwar Jendoubi, Arnaud Martin, Ludovic Liétard, Boutheina Ben Yaghlane, Hend Ben Hadji
Comments: 10 pages, 1 figure. ECSQARU 2015, Proceedings of the 13th European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty, 2015
Subjects: Artificial Intelligence (cs.AI); Social and Information Networks (cs.SI); Machine Learning (stat.ML)

Social message classification is a research domain that has attracted the attention of many researchers in recent years. Indeed, a social message differs from ordinary text because of special characteristics such as its shortness, so the development of new approaches for processing social messages is essential to make their classification more efficient. In this paper, we are mainly interested in classifying social messages based on how they spread on online social networks (OSN). We propose a new distance metric based on the Dynamic Time Warping distance and use it with the probabilistic and evidential k Nearest Neighbors (k-NN) classifiers to classify propagation networks (PrNets) of messages. A propagation network is a directed acyclic graph (DAG) that records the propagation traces of a message, i.e., the traversed links and their types. We tested the proposed metric with the chosen k-NN classifiers on real-world propagation traces collected from the Twitter social network and obtained good classification accuracies.
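The metric builds on the standard dynamic-time-warping recurrence. A minimal sketch of plain DTW between two scalar propagation profiles (ordinary DTW, not the authors' PrNet-specific variant; the example sequences are hypothetical retweet counts over time):

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic time warping distance between 1-D sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j],      # insertion
                                 D[i, j - 1],      # deletion
                                 D[i - 1, j - 1])  # match
    return D[n, m]

print(dtw_distance([0, 1, 3, 7, 8], [0, 1, 2, 3, 7, 8]))
```

A k-NN classifier then labels a new propagation trace by majority vote over the k traces with the smallest DTW distance.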

Identifying Consistent Statements about Numerical Data with Dispersion-Corrected Subgroup Discovery

Mario Boley, Bryan R. Goldsmith, Luca M. Ghiringhelli, Jilles Vreeken
Subjects: Artificial Intelligence (cs.AI); Databases (cs.DB)

Existing algorithms for subgroup discovery with numerical targets do not optimize the error or target variable dispersion of the groups they find. This often leads to unreliable or inconsistent statements about the data, rendering practical applications, especially in scientific domains, futile. Therefore, we here extend the optimistic estimator framework for optimal subgroup discovery to a new class of objective functions: we show how tight estimators can be computed efficiently for all functions that are determined by subgroup size (non-decreasing dependence), the subgroup median value, and a dispersion measure around the median (non-increasing dependence). In the important special case when dispersion is measured using the average absolute deviation from the median, this novel approach yields a linear time algorithm. Empirical evaluation on a wide range of datasets shows that, when used within branch-and-bound search, this approach is highly efficient and indeed discovers subgroups with much smaller errors.
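As a toy instance of an objective with this shape (non-decreasing in subgroup size, centered on the subgroup median, non-increasing in the dispersion around that median), one could score subgroups as follows; this illustrates the function class only, not the authors' estimators:

```python
import numpy as np

def dispersion_corrected_score(subgroup, population):
    """Reward size and median lift; penalize dispersion around the
    subgroup median (average absolute deviation). Illustrative only.
    """
    s = np.asarray(subgroup, dtype=float)
    med = np.median(s)
    lift = med - np.median(population)
    aad = np.mean(np.abs(s - med))   # avg. absolute deviation from median
    return len(s) * (lift - aad)

pop = np.array([1.0, 2.0, 2.5, 3.0, 3.5, 4.0, 9.0])
print(dispersion_corrected_score([3.5, 4.0, 9.0], pop))  # larger but spread out
print(dispersion_corrected_score([3.5, 4.0], pop))       # smaller but tight wins
```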

Logic Programming Petri Nets

Giovanni Sileno

Comments: draft version
Subjects: Artificial Intelligence (cs.AI)

With the purpose of modeling, specifying and reasoning in an integrated fashion with procedural and declarative aspects (both commonly present in cases or scenarios), the paper introduces Logic Programming Petri Nets (LPPN), an extension of the Petri net notation providing an interface to logic programming constructs. Two semantics are presented: first, a hybrid operational semantics that separates the process component, treated with Petri nets, from the constraint/terminological component, treated with Answer Set Programming (ASP); second, a denotational semantics that maps the notation fully to ASP via the Event Calculus. These two alternative specifications enable a preliminary evaluation in terms of reasoning efficiency.
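For readers unfamiliar with the process side, a Petri-net step is easy to state: a transition is enabled when every input place holds enough tokens, and firing it moves tokens from input to output places. A minimal sketch of that firing rule (the LPPN logic-programming interface and the ASP component are not modeled here):

```python
# Markings are dicts place -> token count; pre/post are a transition's
# input and output arc weights.
def enabled(marking, pre):
    return all(marking.get(p, 0) >= n for p, n in pre.items())

def fire(marking, pre, post):
    m = dict(marking)
    for p, n in pre.items():
        m[p] -= n
    for p, n in post.items():
        m[p] = m.get(p, 0) + n
    return m

m0 = {"request": 1, "resource": 1}
t_pre, t_post = {"request": 1, "resource": 1}, {"granted": 1}
if enabled(m0, t_pre):
    print(fire(m0, t_pre, t_post))  # {'request': 0, 'resource': 0, 'granted': 1}
```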

Information Retrieval

Learning to Effectively Select Topics For Information Retrieval Test Collections

Mucahid Kutlu, Tamer Elsayed, Matthew Lease
Subjects: Information Retrieval (cs.IR)

Employing test collections is a common way to evaluate the effectiveness of information retrieval systems. However, due to the high cost of constructing test collections, many researchers have proposed new methods to reduce this cost. Guiver, Mizzaro, and Robertson [19] show that some topics are better than others in terms of evaluation. Inspired by their work, we focus on finding a good subset of topics from a given topic pool, and develop a learning-to-rank based topic selection method. In our experiments with the TREC Robust 2003 and Robust 2004 test collections, we show that our approach selects topics better than prior work. We also compare deep-and-narrow vs. wide-and-shallow judging in terms of evaluation reliability and reusability. When topics are selected randomly, we find that shallow judging is preferable, confirming previous work. However, if topics are selected intelligently, we are able to increase the reliability and reusability of test collections by reducing the number of topics while using more judgments per topic.
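A simple stand-in for the idea (not the paper's learning-to-rank model): greedily grow a topic subset so that system rankings computed on the subset stay close, in Kendall's tau, to rankings computed on the full topic pool. The score matrix below is a random placeholder:

```python
import numpy as np
from scipy.stats import kendalltau

def greedy_topic_subset(scores, k):
    """scores: (n_systems, n_topics) per-topic effectiveness values
    (e.g., average precision). Returns indices of k selected topics."""
    full_ranking = scores.mean(axis=1)
    chosen, remaining = [], list(range(scores.shape[1]))
    for _ in range(k):
        best_t, best_tau = None, -2.0
        for t in remaining:
            tau, _ = kendalltau(full_ranking,
                                scores[:, chosen + [t]].mean(axis=1))
            if tau > best_tau:
                best_t, best_tau = t, tau
        chosen.append(best_t)
        remaining.remove(best_t)
    return chosen

rng = np.random.default_rng(1)
print(greedy_topic_subset(rng.random((20, 50)), 5))  # 20 systems, 50 topics
```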

Match-Tensor: a Deep Relevance Model for Search

Aaron Jaech, Hetunandan Kamisetty, Eric Ringger, Charlie Clarke
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

The application of Deep Neural Networks for ranking in search engines may obviate the need for the extensive feature engineering common to current learning-to-rank methods. However, we show that combining simple relevance matching features like BM25 with existing Deep Neural Net models often substantially improves the accuracy of these models, indicating that they do not capture essential local relevance matching signals. We describe a novel deep Recurrent Neural Net-based model that we call Match-Tensor. The architecture of the Match-Tensor model simultaneously accounts for both local relevance matching and global topicality signals, allowing for a rich interplay between them when computing the relevance of a document to a query. On a large held-out test set consisting of social media documents, we demonstrate not only that Match-Tensor outperforms BM25 and other classes of DNNs but also that it largely subsumes the signals present in these models.
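The name suggests the model's central data structure: a 3-D tensor of per-channel similarities between each query token and each document token, which a downstream network then scores. A minimal sketch of building such a tensor (illustrative shapes; not the published architecture):

```python
import numpy as np

def match_tensor(query_emb, doc_emb):
    """query_emb: (Q, d); doc_emb: (D, d). Returns a (Q, D, d) tensor
    whose (i, j, :) slice is the element-wise product of the i-th query
    token embedding and the j-th document token embedding."""
    return np.einsum('qd,kd->qkd', query_emb, doc_emb)

rng = np.random.default_rng(2)
T = match_tensor(rng.standard_normal((3, 8)), rng.standard_normal((12, 8)))
print(T.shape)  # (3, 12, 8): one similarity "image" per embedding channel
```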

Private Information Retrieval from MDS Coded Data with Colluding Servers: Settling a Conjecture by Freij-Hollanti et al

Hua Sun, Syed A. Jafar
Subjects: Information Theory (cs.IT); Cryptography and Security (cs.CR); Information Retrieval (cs.IR)

A $(K, N, T, K_c)$ instance of the MDS-TPIR problem is comprised of $K$ messages and $N$ distributed servers. Each message is separately encoded through a $(K_c, N)$ MDS storage code. A user wishes to retrieve one message, as efficiently as possible, while revealing no information about the desired message index to any colluding set of up to $T$ servers. The fundamental limit on the efficiency of retrieval, i.e., the capacity of MDS-TPIR, is known only at the extremes where either $T$ or $K_c$ belongs to $\{1, N\}$. The focus of this work is a recent conjecture by Freij-Hollanti, Gnilke, Hollanti and Karpuk which offers a general capacity expression for MDS-TPIR. We prove that the conjecture is false by presenting as a counterexample a PIR scheme for the setting $(K, N, T, K_c) = (2,4,2,2)$, which achieves the rate $3/5$, exceeding the conjectured capacity, $4/7$. Insights from the counterexample lead us to capacity characterizations for various instances of MDS-TPIR, including all cases with $(K, N, T, K_c) = (2, N, T, N-1)$, where $N$ and $T$ can be arbitrary.

Sparse Ternary Codes for similarity search have higher coding gain than dense binary codes

Sohrab Ferdowsi, Slava Voloshynovskiy, Dimche Kostadinov, Taras Holotyak
Comments: Submitted to ISIT 2017
Subjects: Information Theory (cs.IT); Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR)

This paper addresses the problem of Approximate Nearest Neighbor (ANN) search in pattern recognition, where feature vectors in a database are encoded as compact codes in order to speed up the similarity search in large-scale databases. Considering the ANN problem from an information-theoretic perspective, we interpret it as an encoding which maps the original feature vectors to a less-entropic sparse representation while requiring them to be as informative as possible. We then define the coding gain for ANN search using information-theoretic measures. We next show that the classical approach to this problem, which consists of binarization of the projected vectors, is sub-optimal. Instead, we show that a recently proposed ternary encoding achieves higher coding gains.

Computation and Language

Learning Word-Like Units from Joint Audio-Visual Analysis

David Harwath, James R. Glass
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)

Given a collection of images and spoken audio captions, we present a method for discovering word-like acoustic units in the continuous speech signal and grounding them to semantically relevant image regions. For example, our model is able to detect spoken instances of the word ‘lighthouse’ within an utterance and associate them with image regions containing lighthouses. We do not use any form of conventional automatic speech recognition, nor do we use any text transcriptions or conventional linguistic annotations. Our model effectively implements a form of spoken language acquisition, in which the computer learns not only to recognize word categories by sound, but also to enrich the words it learns with semantics by grounding them in images.

Match-Tensor: a Deep Relevance Model for Search

Aaron Jaech, Hetunandan Kamisetty, Eric Ringger, Charlie Clarke
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

The application of Deep Neural Networks for ranking in search engines may obviate the need for the extensive feature engineering common to current learning-to-rank methods. However, we show that combining simple relevance matching features like BM25 with existing Deep Neural Net models often substantially improves the accuracy of these models, indicating that they do not capture essential local relevance matching signals. We describe a novel deep Recurrent Neural Net-based model that we call Match-Tensor. The architecture of the Match-Tensor model simultaneously accounts for both local relevance matching and global topicality signals, allowing for a rich interplay between them when computing the relevance of a document to a query. On a large held-out test set consisting of social media documents, we demonstrate not only that Match-Tensor outperforms BM25 and other classes of DNNs but also that it largely subsumes the signals present in these models.

Distributed, Parallel, and Cluster Computing

On the Design of Distributed Programming Models

Christopher S. Meiklejohn
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

Programming large-scale distributed applications requires new abstractions and models if it is to be done well. We demonstrate that such models are possible. Following from both the FLP result and the CAP theorem, we show that concurrent programming models are necessary, but not sufficient, for the construction of large-scale distributed systems, because of the problem of failures and network partitions: languages need to be able to capture and encode the tradeoffs between consistency and availability.

We demonstrate two programming models, Lasp and Spry, each of which makes a different trade-off with respect to availability. Lasp, which preserves availability and sacrifices consistency, supports the design of convergent, correct-by-construction applications. Spry, which sacrifices both availability and consistency on a per-call basis, allows declarative specification of application-level service level agreements.

ACIA, not ACID: Conditions, Properties and Challenges

Yuqing Zhu, Jianxun Liu, Mengying Guo, Wenlong Ma, Guolei Yi, Yungang Bao
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

Although ACID has long been the golden rule for transaction support, durability is no longer a basic requirement for data storage. Rather, high availability is becoming the first-class property required by online applications. We show that high availability of data is almost surely a stronger property than durability. We thus propose ACIA (Atomicity, Consistency, Isolation, Availability) as the new standard for transaction support. Essentially, the shift from ACID to ACIA is due to a change in the assumed conditions for data management; four major condition changes exist. With ACIA transactions, more diverse application requirements can be flexibly supported through the specification of consistency levels, isolation levels and fault tolerance levels. Clarifying the ACIA properties enables the exploitation of techniques used for ACID transactions, as well as bringing about new challenges for research.

Scalable Architecture for Anomaly Detection and Visualization in Power Generating Assets

Paras Jain, Chirag Tailor, Sam Ford, Liexiao Ding, Michael Phillips, Fang Liu, Nagi Gebraeel, Duen Horng Chau
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

Power-generating assets (e.g., jet engines, gas turbines) are often instrumented with tens to hundreds of sensors for monitoring physical and performance degradation. Anomaly detection algorithms highlight deviations from predetermined benchmarks with the goal of detecting incipient faults. We are developing an integrated system to address three key challenges in analyzing sensor data from power-generating assets: (1) the difficulty of ingesting and analyzing data from large numbers of machines; (2) the prevalence of false alarms generated by anomaly detection algorithms, resulting in unnecessary downtime and maintenance; and (3) the lack of an integrated visualization that helps users understand and explore the flagged anomalies and relevant sensor context in the energy domain. We present preliminary results and our key findings in addressing these challenges. Our system’s scalable event ingestion framework, based on OpenTSDB, ingests nearly 400,000 sensor data samples per second using a 30-machine cluster. To reduce false alarm rates, we leverage the False Discovery Rate (FDR) algorithm, which significantly reduces the number of false alarms. Our visualization tool presents the anomalies and associated content flagged by the FDR algorithm to inform users and practitioners in their decision making process. We believe our integrated platform will help reduce maintenance costs significantly while increasing asset lifespan. We are working to extend our system on multiple fronts, such as scaling to more data and more compute nodes (70 in total).
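The false-alarm filtering step can be pictured with the standard Benjamini-Hochberg procedure, which controls the false discovery rate among flagged anomalies; this is one plausible reading of "the FDR algorithm" above, with hypothetical p-values:

```python
import numpy as np

def benjamini_hochberg(p_values, alpha=0.05):
    """Return a boolean mask of discoveries while controlling the
    expected fraction of false alarms at level alpha."""
    p = np.asarray(p_values)
    order = np.argsort(p)
    m = len(p)
    # Largest k with p_(k) <= (k/m) * alpha; keep hypotheses 1..k.
    thresh = (np.arange(1, m + 1) / m) * alpha
    below = np.nonzero(p[order] <= thresh)[0]
    keep = np.zeros(m, dtype=bool)
    if below.size:
        keep[order[:below[-1] + 1]] = True
    return keep

p = np.array([0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.3, 0.9])
print(benjamini_hochberg(p))  # only the strongest anomalies survive
```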

Two-Party Function Computation on the Reconciled Data

Ivo Kubjas, Vitaly Skachek
Comments: Submitted to ISIT 2017
Subjects: Information Theory (cs.IT); Distributed, Parallel, and Cluster Computing (cs.DC)

Assume a distributed system with two users, each of whom possesses a collection of binary strings. We introduce a new problem termed function computation on the reconciled data, which generalizes the set reconciliation problem in the literature. It is shown that any deterministic protocol that computes a sum and a product of reconciled sets of nonnegative integers has to communicate at least $2^n + n - 1$ and $2^n + n - 3$ bits in the worst-case scenario, respectively, where $n$ is the length of the binary string representations of the numbers. Connections to other problems in computer science, such as set disjointness and finding the intersection, are established, yielding a variety of additional bounds on the communication complexity. A protocol based on the use of a family of hash functions is presented, and its characteristics are analyzed.

Learning

Riemannian-geometry-based modeling and clustering of network-wide non-stationary time series: The brain-network case

Konstantinos Slavakis, Shiva Salsabilian, David S. Wack, Sarah F. Muldoon, Henry E. Baidoo-Williams, Jean M. Vettel, Matthew Cieslak, Scott T. Grafton
Subjects: Learning (cs.LG); Machine Learning (stat.ML)

This paper advocates Riemannian multi-manifold modeling in the context of network-wide non-stationary time-series analysis. Time-series data, collected sequentially over time and across a network, yield features which are viewed as points in or close to a union of multiple submanifolds of a Riemannian manifold, and distinguishing disparate time series amounts to clustering multiple Riemannian submanifolds. To support the claim that exploiting the latent Riemannian geometry behind many statistical features of time series is beneficial to learning from network data, this paper focuses on brain networks and puts forth two feature-generation schemes for network-wide dynamic time series. The first is motivated by Granger-causality arguments and uses an auto-regressive moving average model to map low-rank linear vector subspaces, spanned by column vectors of appropriately defined observability matrices, to points in the Grassmann manifold. The second utilizes (non-linear) dependencies among network nodes by introducing kernel-based partial correlations to generate points in the manifold of positive-definite matrices. Capitalizing on recently developed research on clustering Riemannian submanifolds, an algorithm is provided for distinguishing time series based on their geometrical properties, revealed within Riemannian feature spaces. Extensive numerical tests demonstrate that the proposed framework outperforms classical and state-of-the-art techniques in clustering brain-network states/structures hidden beneath synthetic fMRI time series and brain-activity signals generated from real brain-network structural connectivity matrices.

Strongly Adaptive Regret Implies Optimally Dynamic Regret

Lijun Zhang, Tianbao Yang, Rong Jin, Zhi-Hua Zhou
Subjects: Learning (cs.LG)

To cope with changing environments, recent literature in online learning has introduced the concepts of adaptive regret and dynamic regret independently. In this paper, we illustrate an intrinsic connection between these two concepts by showing that the dynamic regret can be expressed in terms of the adaptive regret and the functional variation. This observation implies that strongly adaptive algorithms can be directly leveraged to minimize the dynamic regret. As a result, we present a series of strongly adaptive algorithms whose dynamic regrets are minimax optimal for convex functions, exponentially concave functions, and strongly convex functions, respectively. To the best of our knowledge, this is the first time that such a dynamic regret bound has been established for exponentially concave functions. Moreover, none of these adaptive algorithms needs any prior knowledge of the functional variation, which is a significant advantage over previous specialized methods for minimizing dynamic regret.

FPGA Architecture for Deep Learning and its application to Planetary Robotics

Pranay Gankidi, Jekan Thangavelautham
Comments: 8 pages, 10 figures. In Proceedings of the IEEE Aerospace Conference 2017
Subjects: Learning (cs.LG); Instrumentation and Methods for Astrophysics (astro-ph.IM); Robotics (cs.RO)

Autonomous control systems onboard planetary rovers and spacecraft benefit from having cognitive capabilities like learning, so that they can adapt to unexpected situations in situ. Q-learning is a form of reinforcement learning that has been effective in solving certain classes of learning problems. However, embedded systems onboard planetary rovers and spacecraft rarely implement learning algorithms, due to the constraints faced in the field, such as processing power, chip size, convergence rate, and costs due to the need for radiation hardening. These challenges present a compelling need for a portable, low-power, area-efficient hardware accelerator to make learning algorithms practical onboard space hardware. This paper presents an FPGA implementation of Q-learning with Artificial Neural Networks (ANN). This method matches the massive parallelism inherent in neural network software with the fine-grain parallelism of FPGA hardware, thereby dramatically reducing processing time. The Mars Science Laboratory currently uses Xilinx space-grade Virtex FPGA devices for image processing, pyrotechnic operation control and obstacle avoidance. We simulate and program our architecture on a Xilinx Virtex 7 FPGA. Architectural implementations for a single-neuron Q-learning accelerator and a more complex Multilayer Perceptron (MLP) Q-learning accelerator are demonstrated. The results show up to a 43-fold speedup by the Virtex 7 FPGA compared to a conventional Intel i5 2.3 GHz CPU. Finally, we simulate the proposed architecture using the Symphony simulator and compiler from Xilinx, and evaluate its performance and power consumption.
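For context, the algorithm being accelerated is the standard Q-learning update; a software sketch of the tabular version is below (the FPGA work implements the neural-network-approximated variant, and the states, actions, and rewards here are hypothetical):

```python
import numpy as np

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.95):
    """One temporal-difference Q-learning step:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    td_target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (td_target - Q[s, a])

Q = np.zeros((4, 2))               # 4 states x 2 actions
q_update(Q, s=0, a=1, r=1.0, s_next=2)
print(Q)
```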

Exploiting Convolutional Neural Network for Risk Prediction with Medical Feature Embedding

Zhengping Che, Yu Cheng, Zhaonan Sun, Yan Liu
Comments: NIPS 2016 Workshop on Machine Learning for Health (ML4HC)
Subjects: Learning (cs.LG); Machine Learning (stat.ML)

The widespread availability of electronic health records (EHRs) promises to usher in the era of personalized medicine. However, the problem of extracting useful clinical representations from longitudinal EHR data remains challenging. In this paper, we explore deep neural network models with learned medical feature embedding to deal with the problems of high dimensionality and temporality. Specifically, we use a multi-layer convolutional neural network (CNN) to parameterize the model, which is thus able to capture the complex non-linear longitudinal evolution of EHRs. Our model can effectively capture local/short temporal dependencies in EHRs, which is beneficial for risk prediction. To account for high dimensionality, we use embedded medical features in the CNN model, which preserve the natural medical concepts. Our initial experiments produce promising results and demonstrate the effectiveness of both the medical feature embedding and the proposed convolutional neural network in risk prediction on cohorts of congestive heart failure and diabetes patients, compared with several strong baselines.
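A minimal NumPy sketch of the embed-convolve-pool-score pattern described above (all shapes and parameters are illustrative assumptions, not the paper's architecture):

```python
import numpy as np

def conv1d_risk_score(code_seq, embed, filt, w_out):
    """code_seq: (T,) int medical codes; embed: (V, d) code embeddings;
    filt: (k, d, c) temporal convolution filters; w_out: (c,) weights.
    Returns a sigmoid risk score in (0, 1)."""
    x = embed[code_seq]                                  # (T, d) embedded events
    T, _ = x.shape
    k, _, c = filt.shape
    conv = np.stack([np.tensordot(x[t:t + k], filt, axes=([0, 1], [0, 1]))
                     for t in range(T - k + 1)])         # (T-k+1, c)
    pooled = np.maximum(conv, 0.0).max(axis=0)           # ReLU + max over time
    return 1.0 / (1.0 + np.exp(-pooled @ w_out))         # sigmoid

rng = np.random.default_rng(3)
print(conv1d_risk_score(rng.integers(0, 100, size=20),
                        0.1 * rng.standard_normal((100, 16)),
                        0.1 * rng.standard_normal((3, 16, 8)),
                        rng.standard_normal(8)))
```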

Linear convergence of SDCA in statistical estimation

Chao Qu, Huan Xu
Subjects: Machine Learning (stat.ML); Learning (cs.LG)

In this paper, we consider stochastic dual coordinate ascent (SDCA) without the strong convexity or convexity assumption, which covers many useful models such as the Lasso, group Lasso, logistic regression with $\ell_1$ regularization, corrected Lasso, and linear regression with the SCAD regularizer. We prove that under a mild condition called restricted strong convexity, satisfied by the above examples, the convergence rate is still linear up to the statistical precision of the model, which is much sharper than previous work with sub-linear results.

A theoretical framework for evaluating forward feature selection methods based on mutual information

Francisco Macedo, M. Rosário Oliveira, António Pacheco, Rui Valadas
Subjects: Machine Learning (stat.ML); Learning (cs.LG)

Feature selection problems arise in a variety of applications, such as microarray analysis, clinical prediction, text categorization, image classification and face recognition, multi-label learning, and classification of internet traffic. Among the various classes of methods, forward feature selection methods based on mutual information have become very popular and are widely used in practice. However, comparative evaluations of these methods have been limited by being based on specific datasets and classifiers. In this paper, we develop a theoretical framework that allows evaluating the methods based on their theoretical properties. Our framework is grounded on the properties of the target objective function that the methods try to approximate, and on a novel categorization of features, according to their contribution to the explanation of the class; we derive upper and lower bounds for the target objective function and relate these bounds with the feature types. Then, we characterize the types of approximations made by the methods, and analyze how these approximations cope with the good properties of the target objective function. Additionally, we develop a distributional setting designed to illustrate the various deficiencies of the methods, and provide several examples of wrong feature selections. Based on our work, we clearly identify the methods that should be avoided, and the methods that currently have the best performance.
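The simplest member of this family ranks features by empirical mutual information with the class and adds them greedily (the MIM criterion, which ignores redundancy between features); a small sketch with synthetic discrete data:

```python
import numpy as np

def mutual_info(x, y):
    """Empirical mutual information between two discrete arrays."""
    mi = 0.0
    for a in np.unique(x):
        for b in np.unique(y):
            pxy = np.mean((x == a) & (y == b))
            if pxy > 0:
                mi += pxy * np.log(pxy / (np.mean(x == a) * np.mean(y == b)))
    return mi

def forward_select(X, y, k):
    """Greedy forward selection by MI with the class label."""
    chosen = []
    for _ in range(k):
        scores = [(mutual_info(X[:, j], y), j)
                  for j in range(X.shape[1]) if j not in chosen]
        chosen.append(max(scores)[1])
    return chosen

rng = np.random.default_rng(4)
y = rng.integers(0, 2, size=200)
noisy_copy = y ^ (rng.random(200) < 0.1)        # informative feature
X = np.column_stack([noisy_copy,
                     rng.integers(0, 2, size=200),
                     rng.integers(0, 3, size=200)])
print(forward_select(X, y, 2))  # feature 0 should be picked first
```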

Fast and Accurate Time Series Classification with WEASEL

Patrick Schäfer, Ulf Leser
Subjects: Data Structures and Algorithms (cs.DS); Learning (cs.LG); Machine Learning (stat.ML)

Time series (TS) occur in many scientific and commercial applications, ranging from earth surveillance to industry automation to smart grids. An important type of TS analysis is classification, which can, for instance, improve energy load forecasting in smart grids by detecting the types of electronic devices based on their energy consumption profiles recorded by automatic sensors. Such sensor-driven applications are very often characterized by (a) very long TS and (b) very large TS datasets needing classification. However, current methods for time series classification (TSC) cannot cope with such data volumes at acceptable accuracy; they are either scalable but offer only inferior classification quality, or they achieve state-of-the-art classification quality but cannot scale to large data volumes.

In this paper, we present WEASEL (Word ExtrAction for time SEries cLassification), a novel TSC method which is both scalable and accurate. Like other state-of-the-art TSC methods, WEASEL transforms time series into feature vectors, using a sliding-window approach, which are then analyzed through a machine learning classifier. The novelty of WEASEL lies in its specific method for deriving features, resulting in a much smaller yet much more discriminative feature set. On the popular UCR benchmark of 85 TS datasets, WEASEL is more accurate than the best current non-ensemble algorithms at orders-of-magnitude lower classification and training times, and it is almost as accurate as ensemble classifiers, whose computational complexity makes them inapplicable even for mid-size datasets. The outstanding robustness of WEASEL is also confirmed by experiments on two real smart grid datasets, where it achieves, out of the box, almost the same accuracy as highly tuned, domain-specific methods.
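A generic sketch of the sliding-window word extraction idea (SAX/BOSS-style discretization with fixed Gaussian breakpoints; WEASEL's actual pipeline uses Fourier-based symbols and feature selection, so this shows only the general flavor):

```python
import numpy as np
from collections import Counter

GAUSS_EDGES = np.array([-0.67, 0.0, 0.67])   # 4 roughly equiprobable bins

def window_words(ts, window=8, word_len=4):
    """Slide a window over the series, z-normalize it, compress it to
    word_len piecewise means, and map each mean to a letter."""
    ts = np.asarray(ts, dtype=float)
    words = []
    for i in range(len(ts) - window + 1):
        w = ts[i:i + window]
        w = (w - w.mean()) / (w.std() + 1e-9)
        seg = w.reshape(word_len, -1).mean(axis=1)
        words.append(''.join('abcd'[k] for k in np.digitize(seg, GAUSS_EDGES)))
    return Counter(words)   # bag-of-words feature map for a classifier

ts = np.sin(np.linspace(0, 6 * np.pi, 64))
print(window_words(ts).most_common(3))
```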

A Model-based Projection Technique for Segmenting Customers

Srikanth Jagabathula, Lakshminarayanan Subramanian, Ashwin Venkataraman
Comments: 51 pages, 3 figures, 4 tables
Subjects: Methodology (stat.ME); Learning (cs.LG); Applications (stat.AP); Machine Learning (stat.ML)

We consider the problem of segmenting a large population of customers into non-overlapping groups with similar preferences, using diverse preference observations such as purchases, ratings, and clicks over subsets of items. We focus on the setting where the universe of items is large (ranging from thousands to millions) and unstructured (lacking well-defined attributes), and each customer provides observations for only a few items. These data characteristics limit the applicability of existing techniques in marketing and machine learning. To overcome these limitations, we propose a model-based projection technique, which transforms the diverse set of observations into a more comparable scale and deals with missing data by projecting the transformed data onto a low-dimensional space. We then cluster the projected data to obtain the customer segments. Theoretically, we derive precise necessary and sufficient conditions that guarantee asymptotic recovery of the true customer segments. Empirically, we demonstrate the speed and performance of our method in two real-world case studies: (a) an 84% improvement in the accuracy of new movie recommendations on the MovieLens data set and (b) a 6% improvement in the performance of a similar-item recommendation algorithm on an offline dataset at eBay. We show that our method outperforms standard latent-class and demographic-based techniques.
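A generic stand-in for the projection-then-cluster pipeline (mean-impute missing preferences, project onto a low-rank space via SVD, then run plain Lloyd k-means on the projected rows; illustrative only, with no empty-cluster handling and not the paper's model-based transform):

```python
import numpy as np

def project_and_cluster(R, rank=2, k=2, iters=50, seed=0):
    """R: (customers, items) with np.nan for unobserved preferences.
    Returns a segment label per customer."""
    col_mean = np.nanmean(R, axis=0)
    X = np.where(np.isnan(R), col_mean, R)       # simple imputation
    U, S, _ = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
    Z = U[:, :rank] * S[:rank]                   # low-dimensional projection
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), k, replace=False)]
    for _ in range(iters):                       # plain Lloyd iterations
        labels = np.argmin(((Z[:, None] - centers) ** 2).sum(-1), axis=1)
        centers = np.stack([Z[labels == j].mean(axis=0) for j in range(k)])
    return labels

rng = np.random.default_rng(5)
R = rng.random((30, 12))
R[rng.random(R.shape) < 0.5] = np.nan            # half the entries missing
print(project_and_cluster(R))
```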

Information Theory

Private Information Retrieval from MDS Coded Data with Colluding Servers: Settling a Conjecture by Freij-Hollanti et al

Hua Sun, Syed A. Jafar
Subjects: Information Theory (cs.IT); Cryptography and Security (cs.CR); Information Retrieval (cs.IR)

A $(K, N, T, K_c)$ instance of the MDS-TPIR problem is comprised of $K$ messages and $N$ distributed servers. Each message is separately encoded through a $(K_c, N)$ MDS storage code. A user wishes to retrieve one message, as efficiently as possible, while revealing no information about the desired message index to any colluding set of up to $T$ servers. The fundamental limit on the efficiency of retrieval, i.e., the capacity of MDS-TPIR, is known only at the extremes where either $T$ or $K_c$ belongs to $\{1, N\}$. The focus of this work is a recent conjecture by Freij-Hollanti, Gnilke, Hollanti and Karpuk which offers a general capacity expression for MDS-TPIR. We prove that the conjecture is false by presenting as a counterexample a PIR scheme for the setting $(K, N, T, K_c) = (2,4,2,2)$, which achieves the rate $3/5$, exceeding the conjectured capacity, $4/7$. Insights from the counterexample lead us to capacity characterizations for various instances of MDS-TPIR, including all cases with $(K, N, T, K_c) = (2, N, T, N-1)$, where $N$ and $T$ can be arbitrary.

On extractable shared information

Johannes Rauh, Pradeep Kr. Banerjee, Eckehard Olbrich, Jürgen Jost, Nils Bertschinger
Comments: 5 pages
Subjects: Information Theory (cs.IT)

We consider the problem of quantifying the information shared by a pair of random variables $X_1, X_2$ about another variable $S$. We propose a new measure of shared information, called extractable shared information, that is left monotonic; that is, the information shared about $S$ is bounded from below by the information shared about $f(S)$ for any function $f$. We show that our measure leads to a new nonnegative decomposition of the mutual information $I(S; X_1 X_2)$ into shared, complementary and unique components. We study properties of this decomposition and show that a left monotonic shared information is not compatible with a Blackwell interpretation of unique information. We also discuss whether it is possible to have a decomposition in which both shared and unique information are left monotonic.

A Variational Characterization of Rényi Divergences

Venkat Anantharam

Comments: EECS Department, University of California, Berkeley CA 94720
Subjects: Information Theory (cs.IT); Probability (math.PR); Statistics Theory (math.ST)

Atar, Chowdhary and Dupuis have recently exhibited a variational formula for exponential integrals of bounded measurable functions in terms of Rényi divergences. We develop a variational characterization of the Rényi divergences between two probability distributions on a measurable space in terms of relative entropies. When combined with the elementary variational formula for exponential integrals of bounded measurable functions in terms of relative entropy, this yields the variational formula of Atar, Chowdhary and Dupuis as a corollary. We also develop an analogous variational characterization of the Rényi divergence rates between two stationary finite state Markov chains in terms of relative entropy rates. When combined with Varadhan’s variational characterization of the spectral radius of square matrices with nonnegative entries in terms of relative entropy, this yields an analog of the variational formula of Atar, Chowdhary and Dupuis in the framework of finite state Markov chains.

On Deep Learning-Based Channel Decoding

Tobias Gruber, Sebastian Cammerer, Jakob Hoydis, Stephan ten Brink
Comments: accepted for CISS 2017
Subjects: Information Theory (cs.IT)

We revisit the idea of using deep neural networks for one-shot decoding of random and structured codes, such as polar codes. Although it is possible to achieve maximum a posteriori (MAP) bit error rate (BER) performance for both code families and for short codeword lengths, we observe that (i) structured codes are easier to learn and (ii) the neural network is able to generalize to codewords that it has never seen during training for structured, but not for random codes. These results provide some evidence that neural networks can learn a form of decoding algorithm, rather than only a simple classifier. We introduce the metric normalized validation error (NVE) in order to further investigate the potential and limitations of deep learning-based decoding with respect to performance and complexity.
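The NVE metric is straightforward to compute from the text's description: average, over the validation SNR points, the ratio of the neural decoder's BER to the MAP decoder's BER, so that NVE = 1 means MAP-matching performance. The BER arrays below are hypothetical:

```python
import numpy as np

def normalized_validation_error(ber_nn, ber_map):
    """NVE = (1/S) * sum_s BER_NN(snr_s) / BER_MAP(snr_s)."""
    return float(np.mean(np.asarray(ber_nn) / np.asarray(ber_map)))

print(normalized_validation_error([1e-2, 2e-3, 1.0e-4],
                                  [9e-3, 1.8e-3, 0.9e-4]))  # ~1.11
```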

Information-geometrical characterization of statistical models which are statistically equivalent to probability simplexes

Hiroshi Nagaoka

Comments: Submitted to IEEE ISIT 2017
Subjects: Information Theory (cs.IT); Statistics Theory (math.ST)

We give a characterization of statistical models on finite sets which are statistically equivalent to probability simplexes in terms of $\alpha$-families, including exponential families and mixture families. The subject has a close relation to some fundamental aspects of information geometry, such as $\alpha$-connections and autoparallelity.

Fast Erasure Coding based on Polynomial Ring Transforms

Jonathan Detchart, Jérôme Lacan
Subjects: Information Theory (cs.IT)

The complexity of software implementations of MDS erasure codes mainly depends on the efficiency of the finite field operations implementation. In this paper, we propose a method to reduce the complexity of the finite field multiplication by using fast transforms between a field and a ring to perform the multiplication in a ring. We show that moving to a ring reduces the complexity of the operations. Then, we show that this construction allows the use of simple scheduling to reduce the number of operations.

Alpha Fair Coded Caching

Apostolos Destounis, Mari Kobayashi, Georgios Paschos, Asma Ghorbel
Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

The performance of existing coded caching schemes is sensitive to the worst channel quality, a problem which is exacerbated when communicating over fading channels. In this paper we address this limitation in the following manner: in the short term, we allow transmissions to subsets of users with good channel quality, avoiding users with fades, while in the long term we ensure fairness across the different users. Our online scheme combines (i) joint scheduling and power control for the broadcast channel with fading, and (ii) congestion control for ensuring the optimal long-term average performance. We restrict the caching operations to the decentralized scheme of [maddah2013decentralized], and subject to this restriction we prove that our scheme has near-optimal overall performance with respect to the convex alpha-fairness coded caching optimization. By tuning the coefficient alpha, the operator can differentiate user performance with respect to the video delivery rates achievable by coded caching.

We demonstrate via simulations our scheme’s superiority over legacy coded caching and unicast opportunistic scheduling, which are identified as special cases of our general framework.

Analogy and duality between random channel coding and lossy source coding

Sergey Tridenski, Ram Zamir
Comments: This paper is self-contained, and serves also as an addendum to our paper “Exponential source/channel duality”
Subjects: Information Theory (cs.IT)

Here we write in a unified fashion (using “R(P, Q, D)”) the random coding exponents in channel coding and lossy source coding. We derive their explicit forms and show that, for a given random codebook distribution Q, the channel decoding error exponent can be viewed as an encoding success exponent in lossy source coding, and the channel correct-decoding exponent can be viewed as an encoding failure exponent in lossy source coding. We then extend the channel exponents to arbitrary D, which corresponds for D > 0 to erasure decoding and for D < 0 to list decoding. For comparison, we also derive the exact random coding exponent for Forney’s optimum tradeoff decoder.

Exponential Source/Channel Duality

Sergey Tridenski, Ram Zamir
Subjects: Information Theory (cs.IT)

We propose a source/channel duality in the exponential regime, where success/failure in source coding parallels error/correctness in channel coding, and a distortion constraint becomes a log-likelihood ratio (LLR) threshold. We establish this duality by first deriving exact exponents for lossy coding of a memoryless source P, at distortion D, for a general i.i.d. codebook distribution Q, for both encoding success (R < R(P,Q,D)) and failure (R > R(P,Q,D)). We then turn to maximum likelihood (ML) decoding over a memoryless channel P with an i.i.d. input Q, and show that if we substitute P=QP, Q=Q, and D=0 under the LLR distortion measure, then the exact exponents for decoding-error (R < I(Q, P)) and strict correct-decoding (R > I(Q, P)) follow as special cases of the exponents for source encoding success/failure, respectively. Moreover, by letting the threshold D take general values, the exact random-coding exponents for erasure (D > 0) and list decoding (D < 0) under the simplified Forney decoder are obtained. Finally, we derive the exact random-coding exponent for Forney’s optimum tradeoff erasure/list decoder, and show that in the erasure regime it coincides with Forney’s lower bound and with the simplified decoder exponent.

Sparse Ternary Codes for similarity search have higher coding gain than dense binary codes

Sohrab Ferdowsi, Slava Voloshynovskiy, Dimche Kostadinov, Taras Holotyak
Comments: Submitted to ISIT 2017
Subjects: Information Theory (cs.IT); Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR)

This paper addresses the problem of Approximate Nearest Neighbor (ANN) search in pattern recognition, where feature vectors in a database are encoded as compact codes in order to speed up the similarity search in large-scale databases. Considering the ANN problem from an information-theoretic perspective, we interpret it as an encoding which maps the original feature vectors to a less-entropic sparse representation while requiring them to be as informative as possible. We then define the coding gain for ANN search using information-theoretic measures. We next show that the classical approach to this problem, which consists of binarization of the projected vectors, is sub-optimal. Instead, we show that a recently proposed ternary encoding achieves higher coding gains.

Private Information Retrieval Schemes for Coded Data with Arbitrary Collusion Patterns

Razane Tajeddine, Oliver W. Gnilke, David Karpuk, Ragnar Freij-Hollanti, Camilla Hollanti, Salim El Rouayheb
Subjects: Information Theory (cs.IT)

In Private Information Retrieval (PIR), one wants to download a file from a database without revealing to the database which file is being downloaded. Much attention has been paid to the case of the database being encoded across several servers, subsets of which can collude to attempt to deduce the requested file. With the goal of studying the achievable PIR rates in realistic scenarios, we generalize results for coded data from the case of all subsets of servers of size $t$ colluding, to arbitrary subsets of the servers. We investigate the effectiveness of previous strategies in this new scenario, and present new results in the case where the servers are partitioned into disjoint colluding groups.

Non-Uniformly Coupled LDPC Codes: Better Thresholds, Smaller Rate-loss, and Less Complexity

Laurent Schmalen, Vahid Aref, Fanny Jardel
Comments: submitted to IEEE International Symposium on Information Theory (ISIT) 2017
Subjects: Information Theory (cs.IT)

We consider spatially coupled low-density parity-check codes with finite smoothing parameters. A finite smoothing parameter is important for designing practical codes that are decoded using low-complexity windowed decoders. By optimizing the amount of coupling between spatial positions, we show that we can construct codes with excellent thresholds and small rate loss, even with the lowest possible smoothing parameter and large variable node degrees, which are required for low error floors. We also establish that the decoding convergence speed is faster with non-uniformly coupled codes, which we verify by density evolution of windowed decoding with a finite number of iterations. We also show that by only slightly increasing the smoothing parameter, practical codes with potentially low error floors and thresholds close to capacity can be constructed. Finally, we give some indications on protograph designs.

Minimum-Distance Based Construction of Multi-Kernel Polar Codes

Valerio Bioglio, Frederic Gabry, Ingmar Land, Jean-Claude Belfiore
Comments: Submitted to ISIT 2017
Subjects: Information Theory (cs.IT)

In this paper, we propose a construction for multi-kernel polar codes based on the maximization of the minimum distance. Compared to the original construction based on density evolution, our new design shows particular advantages for short code lengths, where the polarization effect has less impact on the performance than the distances of the code. We introduce and compute the minimum-distance profile and provide a simple greedy algorithm for the code design. Compared to state-of-the-art punctured or shortened Arikan polar codes, multi-kernel polar codes with our new design show significantly improved error-rate performance.

Lattice coding for Rician fading channels from Hadamard rotations

Alex Karrila, Niko R. Väisänen, David Karpuk, Camilla Hollanti
Subjects: Information Theory (cs.IT)

In this paper, we study lattice coding for Rician fading wireless channels. This is motivated in particular by preliminary studies suggesting the Rician fading model for millimeter-wavelength wireless communications. We restrict to lattice codes arising from rotations of $\mathbb{Z}^n$, and to a single-input single-output (SISO) channel. We observe that several lattice design criteria suggest the optimality of Hadamard rotations. For instance, we prove that Hadamard rotations maximize the diamond-packing density among all rotated $\mathbb{Z}^n$ lattices. Finally, we provide simulations to show that Hadamard rotations outperform optimal algebraic rotations and cross-packing lattices in the Rician channel.
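A Hadamard rotation is cheap to construct; the sketch below builds the orthonormal rotation matrix via Sylvester's recursion (valid when n is a power of two) and applies it to an integer vector, giving a point of the rotated $\mathbb{Z}^n$ lattice:

```python
import numpy as np

def hadamard_rotation(n):
    """Orthonormal Hadamard matrix H / sqrt(n) via Sylvester's
    construction; requires n to be a power of two."""
    assert n > 0 and n & (n - 1) == 0, "need n = 2^k"
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)

R = hadamard_rotation(4)
print(np.allclose(R @ R.T, np.eye(4)))   # True: a proper rotation basis
print(R @ np.array([1, 0, -2, 3]))       # a rotated Z^4 lattice point
```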

Coarse-graining and the Blackwell order

Johannes Rauh, Pradeep Kr. Banerjee, Eckehard Olbrich, Jürgen Jost, Nils Bertschinger, David Wolpert
Comments: 5 pages, 1 figure
Subjects: Information Theory (cs.IT)

Suppose we have a pair of information channels, $\kappa_1, \kappa_2$, with a common input. The Blackwell order is a partial order over channels that compares $\kappa_1$ and $\kappa_2$ by the maximal expected utility an agent can obtain when decisions are based on the outputs of $\kappa_1$ and $\kappa_2$. Equivalently, $\kappa_1$ is said to be Blackwell-inferior to $\kappa_2$ if and only if $\kappa_1$ can be constructed by garbling the output of $\kappa_2$. A related partial order stipulates that $\kappa_2$ is more capable than $\kappa_1$ if the mutual information between the input and output is larger for $\kappa_2$ than for $\kappa_1$ for any distribution over inputs. If one channel is Blackwell-inferior to another then it must be less capable. However, examples are known where $\kappa_1$ is less capable than $\kappa_2$ even though it is not Blackwell-inferior. We give a new example of this phenomenon in which $\kappa_1$ is constructed by coarse-graining the inputs of $\kappa_2$. Such a coarse-graining is a special kind of “pre-garbling” of the channel inputs. This example directly establishes that the expected value of the shared utility function for the coarse-grained channel is larger than it is for the non-coarse-grained channel. This is surprising, as one might think that coarse-graining can only destroy information and lead to inferior channels.

Backscatter Communications for Internet-of-Things: Theory and Applications

Wanchun Liu, Kaibin Huang, Xiangyun Zhou, Salman Durrani
Comments: submitted for possible journal publication
Subjects: Information Theory (cs.IT)

The Internet-of-Things (IoT) is an emerging concept of network connectivity anytime and anywhere for billions of everyday objects, which has recently attracted tremendous attention from both industry and academia. The rapid growth of IoT has been driven by recent advancements in consumer electronics, wireless network densification, 5G communication technologies [e.g., millimeter wave and massive multiple-input and multiple-output (MIMO)], and cloud-computing enabled big-data analytics. One of the remaining key challenges for IoT is the limited network lifetime due to massive IoT devices being powered by batteries with finite capacities. Low-power and low-complexity backscatter communications (BackCom) has emerged as a promising technology for tackling this challenge. In this article, we present an overview of this active area by discussing basic principles, system and network architectures, and relevant techniques. Finally, we describe IoT applications for BackCom and how the technology can solve the energy challenge for IoT.

Explicit Constructions and Bounds for Batch Codes with Restricted Size of Reconstruction Sets

Eldho K. Thomas, Vitaly Skachek
Subjects: Information Theory (cs.IT)

Linear batch codes and codes for private information retrieval (PIR) with a query size $t$ and a restricted size $r$ of the reconstruction sets are studied. New bounds on the parameters of such codes are derived for small values of $t$ or of $r$ by providing corresponding constructions. By building on the ideas of Cadambe and Mazumdar, a new bound in a recursive form is derived for batch codes and PIR codes.

Approximate Capacity of a Class of Partially Connected Interference Channels

Muryong Kim, Yitao Chen, Sriram Vishwanath
Comments: A short version submitted to ISIT 2017
Subjects: Information Theory (cs.IT)

We derive inner and outer bounds on the capacity region for a class of three-user partially connected interference channels. We focus on the impact of topology, interference alignment, and the interplay between interference and noise. The representative channels we consider are the ones that have clear interference alignment gain. For these channels, Z-channel type outer bounds are tight to within a constant gap from capacity. We present near-optimal achievable schemes based on rate-splitting and lattice alignment.

Sample Complexity of the Boolean Multireference Alignment Problem

Emmanuel Abbe, Joao Pereira, Amit Singer
Comments: 5 pages, submitted to ISIT
Subjects: Information Theory (cs.IT)

The Boolean multireference alignment problem consists in recovering a Boolean signal from multiple shifted and noisy observations. In this paper we obtain an expression for the error exponent of the maximum a posteriori decoder. This expression is used to characterize the number of measurements needed for signal recovery in the low-SNR regime, in terms of higher-order autocorrelations of the signal. The characterization is explicit for various signal dimensions, such as prime and even dimensions.

Design of Improved Quasi-Cyclic Protograph-Based Raptor-Like LDPC Codes for Short Block-Lengths

Sudarsan V. S. Ranganathan, Dariush Divsalar, Richard D. Wesel
Comments: Longer version of a submission to the 2017 International Symposium on Information Theory
Subjects: Information Theory (cs.IT)

Protograph-based Raptor-like low-density parity-check codes (PBRL codes) are a recently proposed family of easily encodable and decodable rate-compatible LDPC (RC-LDPC) codes. These codes have an excellent iterative decoding threshold and performance across all design rates. PBRL codes designed thus far, for both long and short block-lengths, have been based on optimizing the iterative decoding threshold of the protograph of the RC code family at various design rates.

In this work, we propose a design method to obtain better quasi-cyclic (QC) RC-LDPC codes with PBRL structure for short block-lengths (of a few hundred bits). We achieve this by maximizing an upper bound on the minimum distance of any QC-LDPC code that can be obtained from the protograph of a PBRL ensemble. The obtained codes outperform the original PBRL codes at short block-lengths by significantly improving the error floor behavior at all design rates. Furthermore, we identify a reduction in complexity of the design procedure, facilitated by the general structure of a PBRL ensemble.

The Role of Transmitter Cooperation in Linear Interference Networks with Block Erasures

Yasemin Karacora , Tolunay Seyfi , Aly El Gamal

Comments: 5 pages, submitted to International Symposium on Information Theory (ISIT 2017)

Subjects

:

Information Theory (cs.IT)

In this work, we explore the potential and optimal use of transmitter

cooperation in large wireless networks with deep fading conditions. We consider

a linear interference network with K transmitter-receiver pairs, where each

transmitter can be connected to two neighboring receivers. Long-term

fluctuations (shadow fading) in the wireless channel can lead to any link being

erased with probability p. Each receiver is interested in one unique message

that can be available at two transmitters. The considered rate criterion is the

per user degrees of freedom (puDoF) as K goes to infinity. Prior to this work,

the optimal assignment of messages to transmitters was identified in the two

limits p -> 0 and p -> 1. We identify new schemes that achieve average puDoF

values that are higher than the state of the art for a significant part of the

range 0 < p < 1. The key idea behind our results is that the role of cooperation shifts from increasing the probability of delivering a message to its intended destination at high values of p, to interference cancellation at low values of p. Our schemes are based on an algorithm that achieves the optimal puDoF value when restricted to a given message assignment, combined with interference-avoidance zero-forcing schemes.

Joint Uplink-Downlink Cell Associations for Interference Networks with Local Connectivity

Manik Singhal , Aly El Gamal

Comments: 5 pages, submitted to International Symposium on Information Theory (ISIT 2017)

Subjects

:

Information Theory (cs.IT)

We study information theoretic models of interference networks that consist

of K Base Station (BS) – Mobile Terminal (MT) pairs. Each BS is connected to

the MT carrying the same index as well as L following MTs. We fix the value of

L and study large networks as K goes to infinity. We assume that each MT can be

associated with Nc BSs, and these associations are determined by a cloud-based

controller that has a global view of the network. An MT has to be associated

with a BS, in order for the BS to transmit its message in the downlink, or

decode its message in the uplink. In previous work, the cell associations that

maximize the average uplink-downlink per user degrees of freedom (puDoF) were

identified for the case when L=1. Further, when only the downlink is

considered, the problem was settled for all values of L when we are restricted

to use only zero-forcing interference cancellation schemes. In this work, we

first propose puDoF inner bounds for arbitrary values of L when only the uplink

is considered, and characterize the uplink puDoF value when Nc > L-1. We then

introduce a new achievable average uplink-downlink puDoF value, and conjecture that the

new scheme is optimal for all values of L, when we restrict our attention to

zero-forcing schemes.

Floor Scale Modulo Lifting for QC-LDPC codes

Nikita Polyanskii , Vasiliy Usatyuk , Ilya Vorobyev

Comments: 5 pages, 2 columns

Subjects

:

Information Theory (cs.IT)

In the given paper we present a novel approach for constructing a QC-LDPC

code of smaller length by lifting a given QC-LDPC code. The proposed method can

be considered as a generalization of floor lifting. We also prove several probabilistic statements showing a theoretical improvement of the method with respect to the number of small cycles. By performing an offline calculation of the scale parameter, it is possible to construct a sequence of QC-LDPC codes with different circulant sizes, all generated from a single exponent matrix using only floor and scale operations. The only parameter stored in memory is the constant needed for scaling.
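
As a rough illustration of this style of lifting (a hedged sketch under our own reading of "floor and scale", not the authors' exact procedure), the following Python snippet derives a child exponent matrix for a smaller circulant size from a parent exponent matrix, then expands an exponent into its circulant permutation block:

```python
import numpy as np

def floor_scale_lift(E, Z_parent, Z_child, scale=1):
    """Hedged sketch: map parent exponents to a smaller circulant size.

    Classic floor lifting uses floor(e * Z_child / Z_parent); the paper's
    variant additionally applies a precomputed scale constant, modeled here
    by `scale` (illustrative, not the paper's exact formula). Entries of -1
    denote all-zero blocks and are left untouched.
    """
    E = np.asarray(E)
    lifted = np.floor(E * scale * Z_child / Z_parent).astype(int) % Z_child
    return np.where(E < 0, -1, lifted)

def expand(e, Z):
    # Z x Z circulant permutation matrix with shift e (-1 -> zero block).
    if e < 0:
        return np.zeros((Z, Z), dtype=int)
    return np.roll(np.eye(Z, dtype=int), e, axis=1)

E = np.array([[0, 17, -1],
              [5,  0, 23]])                 # toy exponent matrix for Z = 32
print(floor_scale_lift(E, 32, 8))           # exponents for circulant size Z = 8
print(expand(3, 4))                         # one expanded circulant block
```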

Non-colocated Time-Reversal MUSIC: High-SNR Distribution of Null Spectrum

D. Ciuonzo , P. Salvo Rossi

Comments: accepted in IEEE Signal Processing Letters

Subjects

:

Information Theory (cs.IT)

We derive the asymptotic distribution of the null spectrum of the well-known

Multiple Signal Classification (MUSIC) in its computational Time-Reversal (TR)

form. The result pertains to a single-frequency non-colocated multistatic

scenario and several TR-MUSIC variants are here investigated. The analysis

builds upon the 1st-order perturbation of the singular value decomposition and

allows a simple characterization of null-spectrum moments (up to the 2nd

order). This enables a comparison in terms of null-spectrum stability. Finally, a

numerical analysis is provided to confirm the theoretical findings.
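
For intuition, the following toy Python sketch (our construction, with hypothetical geometry and noise levels) builds a single-frequency multistatic data matrix for two point scatterers between non-colocated arrays and evaluates the receive-side TR-MUSIC null spectrum via the noise subspace of the SVD:

```python
import numpy as np

rng = np.random.default_rng(1)
k = 2 * np.pi                                          # wavenumber (wavelength = 1)
tx = np.c_[np.linspace(-2, 2, 8),  np.full(8, 0.0)]    # transmit array
rx = np.c_[np.linspace(-2, 2, 10), np.full(10, 5.0)]   # receive array (non-colocated)
scat = np.array([[0.5, 2.0], [-1.0, 3.0]])             # two point scatterers

def green(pts, x):
    # 2-D free-space Green's function between array elements and a point x
    r = np.linalg.norm(pts - x, axis=1)
    return np.exp(1j * k * r) / r

# Multistatic data matrix (Born approximation) plus small measurement noise
K = sum(np.outer(green(rx, s), green(tx, s)) for s in scat)
K = K + 0.01 * (rng.standard_normal(K.shape) + 1j * rng.standard_normal(K.shape))

# Noise subspace from the SVD; the null spectrum dips at scatterer locations
U, sv, Vh = np.linalg.svd(K)
Un = U[:, len(scat):]                                  # receive-side noise subspace

def null_spectrum(x):
    g = green(rx, x)
    return np.linalg.norm(Un.conj().T @ g) / np.linalg.norm(g)

print(null_spectrum(scat[0]))                # ~0: on a scatterer
print(null_spectrum(np.array([1.0, 2.5])))   # O(1): away from scatterers
```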

Locality and Availability of Array Codes Constructed from Subspaces

Natalia Silberstein , Tuvi Etzion , Moshe Schwartz Subjects : Information Theory (cs.IT)

Ever-increasing amounts of data are created and processed in internet-scale

companies such as Google, Facebook, and Amazon. The efficient storage of such

copious amounts of data has thus become a fundamental and acute problem in

modern computing. No single machine can possibly satisfy such immense storage

demands. Therefore, distributed storage systems (DSS), which rely on tens of

thousands of storage nodes, are the only viable solution. Such systems are

broadly used in all modern internet-scale systems. However, the design of a DSS

poses a number of crucial challenges, markedly different from single-user

storage systems. Such systems must be able to reconstruct the data efficiently,

to overcome server failures, to correct errors, etc. A great deal of research has addressed these challenges in recent years, and this effort continues to grow in parallel with the amount of stored data.

The main goal of this paper is to consider codes which have two of the most

important features of distributed storage systems, namely, locality and

availability. Our codes are array codes which are based on subspaces of a

linear space over a finite field. We present several constructions of such

codes, which are \(q\)-analogs of some of the known block codes. Some of these codes possess independent intellectual merit. We examine the locality and availability of the constructed codes. In particular, we distinguish between two types of locality and availability: node versus symbol. To our knowledge, this is the first time that such a distinction is made in the literature.
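
As a toy illustration of symbol locality (our example, using the classical [7,4] Hamming code rather than one of the paper's subspace constructions): a symbol covered by a parity check of weight \(r+1\) can be repaired from the other \(r\) symbols in that check, and the number of disjoint such recovery sets gives its availability.

```python
import numpy as np

# Parity-check matrix of the [7,4] Hamming code (H = [A | I]).
H = np.array([[1, 1, 1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0, 1, 0],
              [1, 0, 1, 1, 0, 0, 1]])

# Each symbol's locality is bounded by the lightest parity check covering it:
# a check of weight r+1 repairs the symbol from the r other symbols in it.
for j in range(H.shape[1]):
    rows = np.flatnonzero(H[:, j])
    r = min(int(H[i].sum()) for i in rows) - 1
    print(f"symbol {j}: locality <= {r}, covered by {len(rows)} check(s)")
```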

Two-Party Function Computation on the Reconciled Data

Ivo Kubjas , Vitaly Skachek

Comments: Submitted to ISIT 2017

Subjects

:

Information Theory (cs.IT)

; Distributed, Parallel, and Cluster Computing (cs.DC)

Assume a distributed system with two users, each user possesses a collection

of binary strings. We introduce a new problem termed function computation on

the reconciled data, which generalizes a set reconciliation problem in the

literature. It is shown that any deterministic protocol that computes a sum and

a product of reconciled sets of nonnegative integers has to communicate at

least \(2^n + n - 1\) and \(2^n + n - 3\) bits in the worst-case scenario, respectively, where \(n\) is the length of the binary string representations of

the numbers. Connections to other problems in computer science, such as set

disjointness and finding the intersection, are established, yielding a variety

of additional bounds on the communication complexity. A protocol based on the use of a family of hash functions is presented, and its characteristics are analyzed.
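
To see why on the order of \(2^n\) bits already suffice (a naive baseline of our own, not the paper's hash-based protocol): user 1 can send the characteristic vector of its set over the universe \(\{0, \dots, 2^n - 1\}\), after which user 2 reconciles the sets and returns the function value.

```python
# Naive baseline (our sketch, not the paper's protocol): user 1 sends the
# characteristic vector of its set (2^n bits); user 2 reconciles and replies
# with the value of the function computed on the union.
def sum_on_reconciled(set_a, set_b, n):
    char_vec = [i in set_a for i in range(2 ** n)]            # 2^n bits sent
    union = set_b | {i for i, bit in enumerate(char_vec) if bit}
    return sum(union)                                          # reply encodes the sum

print(sum_on_reconciled({1, 5, 7}, {2, 5}, 3))                 # 1 + 2 + 5 + 7 = 15
```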

Joint Power Allocation and Beamforming for Energy-Efficient Two-Way Multi-Relay Communications

Zhichao Sheng , Hoang D. Tuan , Trung Q. Duong , H. Vincent Poor

Comments: 26 pages, 9 figures

Subjects

:

Information Theory (cs.IT)

This paper considers the joint design of user power allocation and relay

beamforming in relaying communications, in which multiple pairs of

single-antenna users exchange information with each other via multiple-antenna

relays in two time slots. All users transmit their signals to the relays in the

first time slot while the relays broadcast the beamformed signals to all users

in the second time slot. The aim is to maximize the system’s energy efficiency

(EE) subject to quality-of-service (QoS) constraints in terms of exchange

throughput requirements. The QoS constraints are nonconvex with many nonlinear

cross-terms, so finding a feasible point is already computationally

challenging. The sum throughput appears in the numerator while the total

consumption power appears in the denominator of the EE objective function. The

former is a nonconcave function and the latter is a nonconvex function, making

fractional programming useless for EE optimization. Nevertheless, efficient

iterations of low complexity to obtain its optimized solutions are developed.

The performance of multiple-user, multiple-relay networks under various scenarios is evaluated to demonstrate the merits of the proposed approach.

Relay-Assisted Mixed FSO/RF Systems over Málaga-\(\mathcal{M}\) and \(\kappa\)-\(\mu\) Shadowed Fading Channels

Nesrine Cherif , Imène Trigui , Sofiène Affes Subjects : Information Theory (cs.IT)

This letter presents a unified analytical framework for relay-assisted mixed

FSO/RF transmission. In addition to accounting for different FSO detection

techniques, the mathematical model offers a twofold unification of mixed FSO/RF

systems by considering mixed Málaga-\(\mathcal{M}\)/\(\kappa\)-\(\mu\) shadowed

fading, which includes as special cases nearly all linear turbulence/fading

models adopted in the open literature.

Group Testing using left-and-right-regular sparse-graph codes

Avinash Vem , Nagaraj T. Janakiraman , Krishna R. Narayanan

Comments: Part of this work is submitted to IEEE International Symposium on Information Theory 2017

Subjects

:

Information Theory (cs.IT)

We consider the problem of non-adaptive group testing of \(N\) items out of which \(K\) or fewer items are known to be defective. We propose a testing scheme

based on left-and-right-regular sparse-graph codes and a simple iterative

decoder. We show that for any arbitrarily small \(\epsilon>0\) our scheme requires only \(m=c_\epsilon K\log \frac{c_1 N}{K}\) tests to recover a \((1-\epsilon)\) fraction of the defective items with high probability (w.h.p.), i.e., with probability approaching \(1\) asymptotically in \(N\) and \(K\), where the values of the constants \(c_\epsilon\) and \(\ell\) are a function of the desired error floor \(\epsilon\), and the constant \(c_1=\frac{\ell}{c_\epsilon}\) (observed to be approximately equal to 1 for various values of \(\epsilon\)). More importantly, the iterative decoding algorithm has a sub-linear computational complexity of \(\mathcal{O}(K\log \frac{N}{K})\), which is known to be optimal. Also, for \(m=c_2 K\log K\log \frac{N}{K}\) tests, our scheme recovers the whole set of

defective items w.h.p. These results are valid for both noiseless and noisy

versions of the problem as long as the number of defective items scales sub-linearly with the total number of items, i.e., \(K=o(N)\). The simulation

results validate the theoretical results by showing a substantial improvement

in the number of tests required when compared to the testing scheme based on

left-regular sparse-graphs.
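
The flavor of the scheme can be seen in a small Python simulation (a hedged sketch: a left-regular random design with a basic peeling-style decoder for the noiseless case, not the authors' exact left-and-right-regular construction):

```python
import numpy as np

rng = np.random.default_rng(2)
N, K, ell, T = 1000, 10, 3, 150      # items, defectives, left-degree, tests

# Left-regular random design: each item joins `ell` uniformly chosen tests
# (a simplification of the paper's left-and-right-regular construction).
A = np.zeros((T, N), dtype=bool)
for i in range(N):
    A[rng.choice(T, ell, replace=False), i] = True

defective = np.zeros(N, dtype=bool)
defective[rng.choice(N, K, replace=False)] = True
y = (A @ defective.astype(int)) > 0   # noiseless OR outcome of each pooled test

# Peeling-style decoding: negative tests clear all their items; a positive
# test whose members are all cleared except one identifies that item.
candidate = ~np.any(A[~y], axis=0)
found = np.zeros(N, dtype=bool)
for _ in range(10):                   # a few peeling passes
    for t in np.flatnonzero(y):
        if np.any(A[t] & found):
            continue                  # test already explained by a found item
        live = np.flatnonzero(A[t] & candidate)
        if len(live) == 1:
            found[live[0]] = True

print("defectives found:", int(found.sum()), "of", K,
      "| false positives:", int((found & ~defective).sum()))
```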

An Explicit, Coupled-Layer Construction of a High-Rate MSR Code with Low Sub-Packetization Level, Small Field Size and \(d < (n-1)\)

Birenjith Sasidharan , Myna Vajha , P. Vijay Kumar

Comments: submitted to ISIT 2017. arXiv admin note: text overlap with arXiv:1607.07335

Subjects

:

Information Theory (cs.IT)

This paper presents an explicit construction for an \(((n=2qt, k=2q(t-1), d=n-(q+1)), (\alpha = q(2q)^{t-1}, \beta = \frac{\alpha}{q}))\) regenerating code over a field \(\mathbb{F}_Q\) operating at the Minimum Storage Regeneration (MSR) point. The MSR code can be constructed to have rate \(k/n\) as close to \(1\) as desired, sub-packetization level \(\alpha \leq r^{\frac{n}{r}}\) for \(r=(n-k)\), field size \(Q\) no larger than \(n\), and where all code symbols can be repaired with the same minimum data download. This is the first known construction of such an MSR code for \(d<(n-1)\).
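
A quick sanity check of these parameter formulas (our arithmetic only) shows how the rate \(k/n = (t-1)/t\) approaches 1 as \(t\) grows:

```python
# Evaluate the construction's parameters as stated in the abstract:
# n = 2qt, k = 2q(t-1), d = n-(q+1), alpha = q(2q)^(t-1), beta = alpha/q.
for q, t in [(2, 2), (3, 3), (4, 4)]:
    n, k = 2 * q * t, 2 * q * (t - 1)
    d, alpha = n - (q + 1), q * (2 * q) ** (t - 1)
    print(f"q={q} t={t}: (n,k,d)=({n},{k},{d}) rate={k/n:.3f} "
          f"alpha={alpha} beta={alpha // q}")
```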

On The Compound MIMO Wiretap Channel with Mean Feedback

Amr Abdelaziz , C. Emre Koksal , Hesham El Gamal , Ashraf D. Elbayoumy Subjects : Cryptography and Security (cs.CR) ; Information Theory (cs.IT)

A compound MIMO wiretap channel with double-sided uncertainty is considered under a channel mean information model, in which channel variations are centered around a mean value that is fed back to the

transmitter. We show that the worst case main channel is anti-parallel to the

channel mean information resulting in an overall unit rank channel. Further,

the worst eavesdropper channel is shown to be isotropic around its mean

information. Accordingly, beamforming is shown to be the optimal signaling

strategy. We show that the saddle-point property holds under the mean information model, and thus the compound secrecy capacity equals the worst-case capacity over the uncertainty class. We show that null-steering (NS) beamforming is the optimal beamforming direction, that is, transmission in the direction orthogonal to the eavesdropper's mean channel direction while maintaining the maximum possible gain in the mean main channel direction. An equivalence relation with the MIMO

wiretap channel with Rician fading is established. Simulation results show that

NS beamforming outperforms both maximum ratio transmission (MRT) and zero

forcing (ZF) beamforming approaches over the entire SNR range.
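
The geometry behind NS beamforming is easy to verify numerically; the sketch below (our toy example with random channel means, not the paper's setup) projects the main-channel mean onto the orthogonal complement of the eavesdropper's mean and compares gains against MRT:

```python
import numpy as np

# Toy comparison: with mean main channel h_m and mean eavesdropper channel
# h_e, null steering keeps maximum gain on h_m subject to zero leakage on h_e.
rng = np.random.default_rng(3)
nt = 4
h_m = rng.standard_normal(nt) + 1j * rng.standard_normal(nt)
h_e = rng.standard_normal(nt) + 1j * rng.standard_normal(nt)

w_mrt = h_m / np.linalg.norm(h_m)                  # maximum ratio transmission
P = np.eye(nt) - np.outer(h_e, h_e.conj()) / np.vdot(h_e, h_e)
w_ns = P @ h_m                                     # project h_m away from h_e
w_ns = w_ns / np.linalg.norm(w_ns)                 # null-steering beamformer

for name, w in [("MRT", w_mrt), ("NS", w_ns)]:
    print(name,
          "main gain:", round(abs(np.vdot(h_m, w)) ** 2, 3),
          "eve leakage:", round(abs(np.vdot(h_e, w)) ** 2, 6))
```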

Source: https://www.52ml.net/21502.html