Reposted

arXiv Paper Daily: Mon, 23 Apr 2018

Neural and Evolutionary Computing

An Investigation of Environmental Influence on the Benefits of Adaptation Mechanisms in Evolutionary Swarm Robotics

Andreas Steyven , Emma Hart , Ben Paechter

Comments: In GECCO 2017

Subjects: Neural and Evolutionary Computing (cs.NE)

A robotic swarm that is required to operate for long periods in a potentially

unknown environment can use both evolution and individual learning methods in

order to adapt. However, the role played by the environment in influencing the

effectiveness of each type of learning is not well understood. In this paper,

we address this question by analysing the performance of a swarm in a range of

simulated, dynamic environments where a distributed evolutionary algorithm for

evolving a controller is augmented with a number of different individual

learning mechanisms. The learning mechanisms themselves are defined by

parameters which can be either fixed or inherited. We conduct experiments in a

range of dynamic environments whose characteristics are varied so as to present

different opportunities for learning. Results enable us to map environmental

characteristics to the most effective learning algorithm.

Evolution of a Functionally Diverse Swarm via a Novel Decentralised Quality-Diversity Algorithm

Emma Hart , Andreas S.W. Steyven , Ben Paechter

Comments: In GECCO 2018

Subjects: Neural and Evolutionary Computing (cs.NE)

The presence of functional diversity within a group has been demonstrated to

lead to greater robustness, higher performance and increased problem-solving

ability in a broad range of studies that includes insect groups, human groups

and swarm robotics. Evolving group diversity, however, has proved challenging

within Evolutionary Robotics, requiring reproductive isolation and careful

attention to population size and selection mechanisms. To tackle this issue, we

introduce a novel, decentralised variant of the MAP-Elites illumination

algorithm which is hybridised with a well-known distributed evolutionary

algorithm (mEDEA). The algorithm simultaneously evolves multiple diverse

behaviours for multiple robots, with respect to a simple token-gathering task.

Each robot in the swarm maintains a local archive defined by two pre-specified

functional traits, which is shared with robots it comes into contact with. We

investigate four different strategies for sharing, exploiting and combining

local archives and compare results to mEDEA. Experimental results show that in

contrast to previous claims, it is possible to evolve a functionally diverse

swarm without geographical isolation, and that the new method outperforms mEDEA

in terms of the diversity, coverage and precision of the evolved swarm.
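
The per-robot local archive described above can be pictured with a minimal sketch. It is not the authors' implementation: the class name `LocalArchive`, the grid resolution, the assumption that both traits are normalised to [0, 1], and the "keep the fitter genome per cell" merge rule are all illustrative assumptions in the spirit of MAP-Elites.

```python
class LocalArchive:
    """Hypothetical MAP-Elites-style grid indexed by two functional traits."""

    def __init__(self, bins=5):
        self.bins = bins
        self.cells = {}  # (i, j) -> (fitness, genome)

    def _cell(self, trait_a, trait_b):
        # Traits are assumed to be normalised to [0, 1]; clamp to the grid.
        def index(t):
            return min(self.bins - 1, max(0, int(t * self.bins)))
        return index(trait_a), index(trait_b)

    def add(self, genome, fitness, trait_a, trait_b):
        """Store the genome if its cell is empty or it beats the occupant."""
        key = self._cell(trait_a, trait_b)
        best = self.cells.get(key)
        if best is None or fitness > best[0]:
            self.cells[key] = (fitness, genome)
            return True
        return False

    def merge(self, other):
        """Combine with another robot's archive, keeping the fitter entry per cell."""
        for key, (fitness, genome) in other.cells.items():
            best = self.cells.get(key)
            if best is None or fitness > best[0]:
                self.cells[key] = (fitness, genome)

    def coverage(self):
        """Fraction of grid cells that hold a genome."""
        return len(self.cells) / (self.bins ** 2)


robot_a = LocalArchive(bins=4)
robot_b = LocalArchive(bins=4)
robot_a.add("g1", 1.0, 0.1, 0.1)
robot_b.add("g3", 2.0, 0.1, 0.1)
robot_b.add("g4", 1.0, 0.9, 0.9)
robot_a.merge(robot_b)  # simulates two robots meeting and sharing archives
```

Merging on contact is only one of several conceivable sharing strategies; the abstract mentions four but does not specify them.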

Minimizing Area and Energy of Deep Learning Hardware Design Using Collective Low Precision and Structured Compression

Shihui Yin , Gaurav Srivastava , Shreyas K. Venkataramanaiah , Chaitali Chakrabarti , Visar Berisha , Jae-sun Seo

Comments: 2017 Asilomar Conference on Signals, Systems and Computers

Subjects: Neural and Evolutionary Computing (cs.NE)

Deep learning algorithms have shown tremendous success in many recognition

tasks; however, these algorithms typically include a deep neural network (DNN)

structure and a large number of parameters, which makes it challenging to

implement them on power/area-constrained embedded platforms. To reduce the

network size, several studies investigated compression by introducing

element-wise or row-/column-/block-wise sparsity via pruning and

regularization. In addition, many recent works have focused on reducing

precision of activations and weights with some reducing down to a single bit.

However, combining various sparsity structures with binarized or

very-low-precision (2-3 bit) neural networks has not been comprehensively

explored. In this work, we present design techniques for minimum-area/-energy

DNN hardware with minimal degradation in accuracy. During training, both

binarization/low-precision and structured sparsity are applied as constraints

to find the smallest memory footprint for a given deep learning algorithm. The

DNN model for CIFAR-10 dataset with weight memory reduction of 50X exhibits

accuracy comparable to that of the floating-point counterpart. Area,

performance and energy results of DNN hardware in 40nm CMOS are reported for

the MNIST dataset. The optimized DNN that combines 8X structured compression

and 3-bit weight precision showed 98.4% accuracy at 20nJ per classification.
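
As a rough illustration of what a 3-bit weight constraint means (not the authors' training procedure), the sketch below rounds a weight tensor onto a symmetric uniform grid. The function name `quantize_uniform` and the per-tensor scaling rule are assumptions made for this example.

```python
import numpy as np


def quantize_uniform(weights, bits=3):
    """Round weights onto a symmetric uniform grid with at most 2**bits levels."""
    if bits < 2:
        raise ValueError("this sketch needs at least 2 bits")
    weights = np.asarray(weights, dtype=float)
    qmax = 2 ** (bits - 1) - 1          # largest positive code, 3 for 3 bits
    peak = np.abs(weights).max()
    if peak == 0.0:
        return np.zeros_like(weights)
    scale = peak / qmax                 # grid step (assumed per-tensor scaling)
    codes = np.clip(np.round(weights / scale), -qmax - 1, qmax)
    return codes * scale                # de-quantised values on the grid


w = np.linspace(-1.0, 1.0, 101)
wq = quantize_uniform(w, bits=3)
```

In a training setting such a rounding step would typically be applied in the forward pass while full-precision weights are updated; that detail is an assumption and is not stated in the abstract.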

A Simple Quantum Neural Net with a Periodic Activation Function

Ammar Daskin

Comments: conference paper

Subjects: Quantum Physics (quant-ph); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

In this paper, we propose a simple neural net that requires only \(O(n \log_2 k)\)

numbers of quantum gates and qubits: Here, \(n\) is the number of input

parameters, and \(k\) is the number of weights applied to these input parameters

in the proposed neural net. We describe the network in terms of a quantum

circuit, and then draw its equivalent classical neural net which involves

\(O(k^n)\) nodes in the hidden layer. Then, we show that the network uses a

periodic activation function of cosine values of the linear combinations of the

inputs and weights. The steps of the gradient descent are described, and then

Iris and Breast cancer datasets are used for the numerical simulations. The

numerical results indicate the network can be used in machine learning problems

and it may provide exponential speedup over the same structured classical

neural net.
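
One way to read the hidden-layer count is that each hidden node corresponds to picking one of the k weights for each of the n inputs, which gives k^n combinations. The sketch below enumerates them classically and applies the cosine activation. This reading, the function name `hidden_layer`, and the weight layout are assumptions; the paper's circuit may define the combinations differently.

```python
import itertools
import math


def hidden_layer(inputs, weights):
    """Cosine activations over every per-input weight choice.

    weights[j] holds the k candidate weights for input j (assumed layout),
    so the output has k**n entries for n inputs.
    """
    outputs = []
    for choice in itertools.product(*weights):
        z = sum(w * x for w, x in zip(choice, inputs))  # linear combination
        outputs.append(math.cos(z))                     # periodic activation
    return outputs


activations = hidden_layer([0.1, 0.2, 0.3], [[1.0, 2.0]] * 3)  # n = 3, k = 2
```

The classical enumeration grows exponentially in n, which is the cost the quantum circuit is claimed to avoid.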

Computer Vision and Pattern Recognition

Synthesizing Images of Humans in Unseen Poses

Guha Balakrishnan , Amy Zhao , Adrian V. Dalca , Fredo Durand , John Guttag

Comments: CVPR 2018

Subjects: Computer Vision and Pattern Recognition (cs.CV)

We address the computational problem of novel human pose synthesis. Given an

image of a person and a desired pose, we produce a depiction of that person in

that pose, retaining the appearance of both the person and background. We

present a modular generative neural network that synthesizes unseen poses using

training pairs of images and poses taken from human action videos. Our network

separates a scene into different body part and background layers, moves body

parts to new locations and refines their appearances, and composites the new

foreground with a hole-filled background. These subtasks, implemented with

separate modules, are trained jointly using only a single target image as a

supervised label. We use an adversarial discriminator to force our network to

synthesize realistic details conditioned on pose. We demonstrate image

synthesis results on three action classes: golf, yoga/workouts and tennis, and

show that our method produces accurate results within action classes as well as

across action classes. Given a sequence of desired poses, we also produce

coherent videos of actions.

ADef: an Iterative Algorithm to Construct Adversarial Deformations

Rima Alaifari , Giovanni S. Alberti , Tandri Gauksson

Comments: 10 pages, 5 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Learning (cs.LG); Machine Learning (stat.ML)

While deep neural networks have proven to be a powerful tool for many

recognition and classification tasks, their stability properties are still not

well understood. In the past, image classifiers have been shown to be

vulnerable to so-called adversarial attacks, which are created by additively

perturbing the correctly classified image.

In this paper, we propose the ADef algorithm to construct a different kind of

adversarial attack created by iteratively applying small deformations to the

image, found through a gradient descent step. We demonstrate our results on

MNIST with a convolutional neural network and on ImageNet with Inception-v3 and

ResNet-101.

Image Inpainting for Irregular Holes Using Partial Convolutions

Guilin Liu , Fitsum A. Reda , Kevin J. Shih , Ting-Chun Wang , Andrew Tao , Bryan Catanzaro

Comments: 23 pages, includes appendix

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Existing deep learning based image inpainting methods use a standard

convolutional network over the corrupted image, using convolutional filter

responses conditioned on both valid pixels as well as the substitute values in

the masked holes (typically the mean value). This often leads to artifacts such

as color discrepancy and blurriness. Post-processing is usually used to reduce

such artifacts, but is expensive and may fail. We propose the use of partial

convolutions, where the convolution is masked and renormalized to be

conditioned on only valid pixels. We further include a mechanism to

automatically generate an updated mask for the next layer as part of the

forward pass. Our model outperforms other methods for irregular masks. We show

qualitative and quantitative comparisons with other methods to validate our

approach.
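
The masked, renormalised convolution and the mask update can be sketched for a single channel as follows. This is a simplified NumPy illustration, not the authors' code: the function name `partial_conv2d`, the stride-1 "valid" windowing, and the renormalisation factor (window size divided by the number of valid pixels) are assumptions consistent with the description above.

```python
import numpy as np


def partial_conv2d(image, mask, kernel, bias=0.0):
    """Single-channel partial convolution sketch (stride 1, no padding).

    mask is 1 for valid pixels and 0 for holes. Returns the output and
    the updated mask for the next layer.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    new_mask = np.zeros((out_h, out_w))
    window_size = kh * kw
    for i in range(out_h):
        for j in range(out_w):
            m = mask[i:i + kh, j:j + kw]
            valid = m.sum()
            if valid > 0:
                x = image[i:i + kh, j:j + kw]
                # Use valid pixels only, then rescale for the missing ones.
                out[i, j] = (kernel * x * m).sum() * (window_size / valid) + bias
                new_mask[i, j] = 1.0  # at least one valid pixel was seen
    return out, new_mask


image = np.ones((4, 4))
mask = np.ones((4, 4))
mask[1, 1] = 0.0                      # one-pixel hole
kernel = np.full((3, 3), 1.0 / 9.0)   # mean filter
out, new_mask = partial_conv2d(image, mask, kernel)
```

With a mean filter on a constant image, the renormalisation restores the constant value despite the hole, which is the behaviour the masking is meant to achieve.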

Rethinking the Faster R-CNN Architecture for Temporal Action Localization

Yu-Wei Chao , Sudheendra Vijayanarasimhan , Bryan Seybold , David A. Ross , Jia Deng , Rahul Sukthankar

Comments: Accepted in CVPR 2018

Subjects: Computer Vision and Pattern Recognition (cs.CV)

We propose TAL-Net, an improved approach to temporal action localization in

video that is inspired by the Faster R-CNN object detection framework. TAL-Net

addresses three key shortcomings of existing approaches: (1) we improve

receptive field alignment using a multi-scale architecture that can accommodate

extreme variation in action durations; (2) we better exploit the temporal

context of actions for both proposal generation and action classification by

appropriately extending receptive fields; and (3) we explicitly consider

multi-stream feature fusion and demonstrate that fusing motion late is

important. We achieve state-of-the-art performance for both action proposal and

localization on the THUMOS’14 detection benchmark and competitive performance on

the ActivityNet challenge.

One-Shot Learning using Mixture of Variational Autoencoders: a Generalization Learning approach

Decebal Constantin Mocanu , Elena Mocanu

Journal-ref: 17th International Conference on Autonomous Agents and Multiagent

Systems (AAMAS 2018)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

Deep learning, even if it is very successful nowadays, traditionally needs

very large amounts of labeled data to perform well on the classification

task. In an attempt to solve this problem, the one-shot learning paradigm,

which makes use of just one labeled sample per class and prior knowledge,

becomes increasingly important. In this paper, we propose a new one-shot

learning method, dubbed MoVAE (Mixture of Variational AutoEncoders), to perform

classification. Complementary to prior studies, MoVAE represents a shift of

paradigm in comparison with the usual one-shot learning methods, as it does not

use any prior knowledge. Instead, it starts from zero knowledge and one labeled

sample per class. Afterward, by using unlabeled data and the generalization

learning concept (in a way, more as humans do), it can gradually

improve its own performance. Moreover, even if no unlabeled data are

available, MoVAE can still perform well in one-shot learning classification. We

demonstrate empirically the efficiency of our proposed approach on three

datasets, i.e. the handwritten digits (MNIST), fashion products

(Fashion-MNIST), and handwritten characters (Omniglot), showing that MoVAE

outperforms state-of-the-art one-shot learning algorithms.

MobileFaceNets: Efficient CNNs for Accurate Real-time Face Verification on Mobile Devices

Sheng Chen , Yang Liu , Xiang Gao , Zhen Han

Comments: To be submitted to SPL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

In this paper, we present a class of extremely efficient CNN models called

MobileFaceNets, which use no more than 1 million parameters and are specifically

tailored for high-accuracy real-time face verification on mobile and embedded

devices. We also give a simple analysis of the weakness of common mobile

networks for face verification. This weakness is overcome by our

specifically designed MobileFaceNets. Under the same experimental conditions,

our MobileFaceNets achieve significantly superior accuracy as well as more than

2 times actual speedup over MobileNetV2. After being trained with ArcFace loss on the

refined MS-Celeb-1M from scratch, our single MobileFaceNet model of 4.0MB size

achieves 99.55% face verification accuracy on LFW and 92.59% TAR (FAR = 1e-6) on

MegaFace Challenge 1, which is even comparable to state-of-the-art big CNN

models of hundreds of MB in size. The fastest one of our MobileFaceNets has an actual

inference time of 18 milliseconds on a mobile phone. Our experiments on LFW,

AgeDB, and MegaFace show that our MobileFaceNets achieve significantly improved

efficiency compared with the state-of-the-art lightweight and mobile CNNs for

face verification.

An Approximate Shading Model with Detail Decomposition for Object Relighting

Zicheng Liao , Kevin Karsch , Hongyi Zhang , David Forsyth

Subjects: Computer Vision and Pattern Recognition (cs.CV)

We present an object relighting system that allows an artist to select an

object from an image and insert it into a target scene. Through simple

interactions, the system can adjust illumination on the inserted object so that

it appears naturally in the scene. To support image-based relighting, we build

an object model from the image, and propose a perceptually-inspired

approximate shading model for the relighting. It decomposes the shading field

into (a) a rough shape term that can be reshaded, (b) a parametric shading

detail that encodes missing features from the first term, and (c) a geometric

detail term that captures fine-scale material properties. With this

decomposition, the shading model combines 3D rendering and image-based

composition and allows more flexible compositing than image-based methods.

Quantitative evaluation and a set of user studies suggest our method is a

promising alternative to existing methods of object insertion.

Residual-Guide Feature Fusion Network for Single Image Deraining

Zhiwen Fan , Huafeng Wu , Xueyang Fu , Yue Huang , Xinghao Ding

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Single image rain streaks removal is extremely important since rainy images

adversely affect many computer vision systems. Deep learning based methods have

found great success in image deraining tasks. In this paper, we propose a novel

residual-guide feature fusion network, called ResGuideNet, for single image

deraining that progressively predicts high-quality reconstruction. Specifically,

we propose a cascaded network and adopt residuals generated from shallower

blocks to guide deeper blocks. By using this strategy, we can obtain a coarse

to fine estimation of negative residual as the blocks go deeper. The outputs of

different blocks are merged into the final reconstruction. We adopt recursive

convolution to build each block and apply supervision to all intermediate

results, which enable our model to achieve promising performance on synthetic

and real-world data while using fewer parameters than previously required.

ResGuideNet is detachable to meet different rainy conditions. For images with

light rain streaks and limited computational resource at test time, we can

obtain a decent performance even with several building blocks. Experiments

validate that ResGuideNet can benefit other low- and high-level vision tasks.

Graph-based Hypothesis Generation for Parallax-tolerant Image Stitching

Jing Chen , Nan Li , Tianli Liao

Comments: 3 pages, 3 figures, 2 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)

The seam-driven approach has been proven fairly effective for

parallax-tolerant image stitching, whose strategy is to search for an invisible

seam from finite representative hypotheses of local alignment. In this paper,

we propose a graph-based hypothesis generation and a seam-guided local

alignment for improving the effectiveness and the efficiency of the seam-driven

approach. Experiments demonstrate a significant reduction in the number of

hypotheses and improved naturalness of the final stitching results,

compared with the state-of-the-art method SEAGULL.

Accurate Deep Direct Geo-Localization from Ground Imagery and Phone-Grade GPS

Shaohui Sun , Ramesh Sarukkai , Jack Kwok , Vinay Shet

Comments: To appear in CVPR 2018 Workshops

Subjects: Computer Vision and Pattern Recognition (cs.CV)

One of the most critical topics in autonomous driving or ride-sharing

technology is to accurately localize vehicles in the world frame. In addition

to common multi-view camera systems, it usually also relies on industrial grade

sensors, such as LiDAR, differential GPS, high-precision IMU, etc. In this

paper, we develop an approach to provide an effective solution to this problem.

We propose a method to train a geo-spatial deep neural network (CNN+LSTM) to

predict accurate geo-locations (latitude and longitude) using only ordinary

ground imagery and low accuracy phone-grade GPS. We evaluate our approach on

the open dataset released during ACM Multimedia 2017 Grand Challenge. Having

ground truth locations for training, we are able to reach nearly lane-level

accuracy. We also evaluate the proposed method on our own collected images in

the San Francisco downtown area, often described as a “downtown canyon”, where consumer

GPS signals are extremely inaccurate. The results show the model can predict

quality locations that suffice in real business applications, such as

ride-sharing, only using phone-grade GPS. Unlike classic visual localization or

recent PoseNet-like methods that may work well in indoor environments or

small-scale outdoor environments, we avoid using a map or an SFM

(structure-from-motion) model at all. More importantly, the proposed method can

be scaled up without concerns over the potential failure of 3D reconstruction.

A Complementary Tracking Model with Multiple Features

Peng Gao , Yipeng Ma , Ke Song , Chao Li , Fei Wang , Liyi Xiao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)

Discriminative Correlation Filters (DCF)-based tracking algorithms exploiting

conventional handcrafted features have achieved impressive results both in

terms of accuracy and robustness. Template handcrafted features have shown

excellent performance, but they perform poorly when the appearance of the target

changes rapidly, such as under fast motion and fast deformation. In contrast,

statistical handcrafted features are insensitive to fast state changes, but

they yield inferior performance in the scenarios of illumination variations and

background clutter. In this work, to achieve an efficient tracking

performance, we propose a novel visual tracking algorithm, named MFCMT, based

on a complementary ensemble model with multiple features, including Histogram

of Oriented Gradients (HOGs), Color Names (CNs) and Color Histograms (CHs).

Additionally, to improve tracking results and prevent target drift, we

introduce an effective fusion method by exploiting relative entropy to coalesce

all basic response maps and get an optimal response. Furthermore, we suggest a

simple but efficient update strategy to boost tracking performance.

Comprehensive evaluations are conducted on two tracking benchmarks,

and the experimental results demonstrate that our method is competitive with

numerous state-of-the-art trackers. Our tracker achieves impressive performance

with faster speed on these benchmarks.

Generating a Fusion Image: One's Identity and Another's Shape

Donggyu Joo , Doyeon Kim , Junmo Kim

Comments: To appear in CVPR 2018

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Generating a novel image by manipulating two input images is an interesting

research problem in the study of generative adversarial networks (GANs). We

propose a new GAN-based network that generates a fusion image with the identity

of input image x and the shape of input image y. Our network can simultaneously

train on more than two image datasets in an unsupervised manner. We define an

identity loss L_I to catch the identity of image x and a shape loss L_S to get

the shape of y. In addition, we propose a novel training method called

Min-Patch training to focus the generator on crucial parts of an image, rather

than its entirety. We show qualitative results on the VGG Youtube Pose dataset,

Eye dataset (MPIIGaze and UnityEyes), and the Photo-Sketch-Cartoon dataset.

View Adaptive Neural Networks for High Performance Skeleton-based Human Action Recognition

Pengfei Zhang , Cuiling Lan , Junliang Xing , Wenjun Zeng , Jianru Xue , Nanning Zheng

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Skeleton-based human action recognition has recently attracted increasing

attention thanks to the accessibility and the popularity of 3D skeleton data.

One of the key challenges in skeleton-based action recognition lies in the

large view variations when capturing data. In order to alleviate the effects of

view variations, this paper introduces a novel view adaptation scheme, which

automatically determines the virtual observation viewpoints in a learning based

data driven manner. We design two view adaptive neural networks, i.e., VA-RNN

based on RNN, and VA-CNN based on CNN. For each network, a novel view

adaptation module learns and determines the most suitable observation

viewpoints, and transforms the skeletons to those viewpoints for the end-to-end

recognition with a main classification network. Ablation studies find that the

proposed view adaptive models are capable of transforming the skeletons of

various viewpoints to much more consistent virtual viewpoints which largely

eliminates the viewpoint influence. In addition, we design a two-stream scheme

(referred to as VA-fusion) that fuses the scores of the two networks to provide

the fused prediction. Extensive experimental evaluations on five challenging

benchmarks demonstrate the effectiveness of the proposed view-adaptive

networks and their superior performance over state-of-the-art approaches.

Vision Meets Drones: A Challenge

Pengfei Zhu , Longyin Wen , Xiao Bian , Haibing Ling , Qinghua Hu

Comments: 11 pages, 11 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)

In this paper we present a large-scale visual object detection and tracking

benchmark, named VisDrone2018, aiming at advancing visual understanding tasks

on the drone platform. The images and video sequences in the benchmark were

captured over various urban/suburban areas of 14 different cities across China

from north to south. Specifically, VisDrone2018 consists of 263 video clips and

10,209 images (no overlap with video clips) with rich annotations, including

object bounding boxes, object categories, occlusion, truncation ratios, etc.

With an intensive amount of effort, our benchmark has more than 2.5 million

annotated instances in 179,264 images/video frames. Being the largest such

dataset ever published, the benchmark enables extensive evaluation and

investigation of visual analysis algorithms on the drone platform. In

particular, we design four popular tasks with the benchmark, including object

detection in images, object detection in videos, single object tracking, and

multi-object tracking. All these tasks are extremely challenging in the

proposed dataset due to factors such as occlusion, large scale and pose

variation, and fast motion. We hope the benchmark will greatly boost research

and development in visual analysis on drone platforms.

Calibration-free B0 correction of EPI data using structured low rank matrix recovery

Arvind Balachandrasekaran , Merry Mani , Mathews Jacob

Subjects: Computer Vision and Pattern Recognition (cs.CV)

We introduce a structured low rank algorithm for the calibration-free

compensation of field inhomogeneity artifacts in Echo Planar Imaging (EPI) MRI

data. We acquire the data using two EPI readouts that differ in echo-time (TE).

Using time segmentation, we reformulate the field inhomogeneity compensation

problem as the recovery of an image time series from highly undersampled

Fourier measurements. The temporal profile at each pixel is modeled as a single

exponential, which is exploited to fill in the missing entries. We show that

the exponential behavior at each pixel, along with the spatial smoothness of

the exponential parameters, can be exploited to derive a 3D annihilation

relation in the Fourier domain. This relation translates to a low rank property

on a structured multi-fold Toeplitz matrix, whose entries correspond to the

measured k-space samples. We introduce a fast two-step algorithm for the

completion of the Toeplitz matrix from the available samples. In the first

step, we estimate the null space vectors of the Toeplitz matrix using only its

fully sampled rows. The null space is then used to estimate the signal

subspace, which facilitates the efficient recovery of the time series of

images. We finally demonstrate the proposed approach on spherical MR phantom

data and human data and show that the artifacts are significantly reduced. The

proposed approach could potentially be used to compensate for time varying

field map variations in dynamic applications such as functional MRI.
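
The link between a single-exponential temporal profile and a low-rank structured matrix can be seen in a simplified one-dimensional analogue. The sketch below is not the paper's multi-fold Toeplitz construction in the Fourier domain: it only checks that samples of one exponential are annihilated by a two-tap filter and that the Hankel matrix built from them has rank 1. The helper name `hankel_matrix`, the decay and frequency values, and the rank tolerance are assumptions.

```python
import numpy as np


def hankel_matrix(signal, rows):
    """Stack shifted copies of a 1-D signal into a Hankel matrix."""
    cols = len(signal) - rows + 1
    return np.array([signal[r:r + cols] for r in range(rows)])


t = np.arange(8)
rho = np.exp(-0.3 + 1.1j)            # assumed decay and frequency per time step
x = 2.0 * rho ** t                   # single-exponential temporal profile

# Annihilation relation: x[n+1] - rho * x[n] = 0 for every n.
residual = x[1:] - rho * x[:-1]

rank_one = np.linalg.matrix_rank(hankel_matrix(x, 4), tol=1e-8)

# Adding a second exponential raises the rank to 2.
x_two = x + 0.5 * np.exp(-0.1 * t)
rank_two = np.linalg.matrix_rank(hankel_matrix(x_two, 4), tol=1e-8)
```

Low rank of this kind is what makes it possible to fill in missing entries of the structured matrix from the available samples.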

High Dynamic Range SLAM with Map-Aware Exposure Time Control

Sergey V. Alexandrov , Johann Prankl , Michael Zillich , Markus Vincze

Comments: 3DV 2017

Subjects: Computer Vision and Pattern Recognition (cs.CV)

The research in dense online 3D mapping is mostly focused on the geometrical

accuracy and spatial extent of the reconstructions. Their color appearance is

often neglected, leading to inconsistent colors and noticeable artifacts. We

rectify this by extending a state-of-the-art SLAM system to accumulate colors

in HDR space. We replace the simplistic pixel intensity averaging scheme with

HDR color fusion rules tailored to the incremental nature of SLAM and a noise

model suitable for off-the-shelf RGB-D cameras. Our main contribution is a

map-aware exposure time controller. It makes decisions based on the global

state of the map and predicted camera motion, attempting to maximize the

information gain of each observation. We report a set of experiments

demonstrating the improved texture quality and advantages of using the custom

controller that is tightly integrated in the mapping loop.

Survey of Face Detection on Low-quality Images

Yuqian Zhou , Ding Liu , Thomas Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Face detection is a well-explored problem. Many challenges for face detectors,

such as extreme pose, illumination, low resolution and small scales, have been studied in

previous work. However, previously proposed models are mostly trained and

tested on good-quality images, which is not always the case in practical

applications like surveillance systems. In this paper, we first review the

current state-of-the-art face detectors and their performance on the benchmark

dataset FDDB, and compare the design protocols of the algorithms. Secondly, we

investigate their performance degradation while testing on low-quality images

with different levels of blur, noise, and contrast. Our results demonstrate

that both hand-crafted and deep-learning based face detectors are not robust

enough for low-quality images. This should inspire researchers to produce more robust

designs for face detection in the wild.

Unsupervised Representation Adversarial Learning Network: from Reconstruction to Generation

Yuqian Zhou , Kuangxiao Gu , Thomas Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

A good representation for arbitrarily complicated data should have the

capability of semantic generation, clustering and reconstruction. Previous

research has already achieved impressive performance on each of these individually. This paper

aims at learning a disentangled representation effective for all of them in an

unsupervised way. To achieve all three tasks together, we learn the forward

and inverse mapping between data and representation on the basis of a symmetric

adversarial process. In theory, we minimize the upper bound of the two

conditional entropy losses between the latent variables and the observations

together to achieve the cycle consistency. The newly proposed RepGAN is tested

on MNIST, Fashion-MNIST, CelebA, and SVHN datasets to perform unsupervised or

semi-supervised classification, generation and reconstruction tasks. The results

demonstrate that RepGAN is able to learn a useful and competitive

representation. To the authors’ knowledge, our work is the first one to achieve

both a high unsupervised classification accuracy and low reconstruction error

on MNIST.

Weakly Supervised Representation Learning for Unsynchronized Audio-Visual Events

Sanjeel Parekh , Slim Essid , Alexey Ozerov , Ngoc Q. K. Duong , Patrick Pérez , Gaël Richard

Subjects: Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)

Audio-visual representation learning is an important task from the

perspective of designing machines with the ability to understand complex

events. To this end, we propose a novel multimodal framework that instantiates

multiple instance learning. We show that the learnt representations are useful

for classifying events and localizing their characteristic audio-visual

elements. The system is trained using only video-level event labels without any

timing information. An important feature of our method is its capacity to learn

from unsynchronized audio-visual events. We achieve state-of-the-art results on

a large-scale dataset of weakly-labeled audio event videos. Visualizations of

localized visual regions and audio segments substantiate our system’s efficacy,

especially when dealing with noisy situations where modality-specific cues

appear asynchronously.
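
To make the multiple instance learning idea concrete, the sketch below treats a video as a bag of instances (for example, visual regions or audio segments) that each receive per-class scores, and derives a video-level prediction without instance labels. Max pooling over instances is a common choice and is an assumption here; the abstract does not say which aggregation the authors use, and the function names are hypothetical.

```python
import numpy as np


def video_level_scores(instance_scores):
    """Aggregate (num_instances, num_classes) scores into one score per class."""
    return instance_scores.max(axis=0)   # assumed max-pooling aggregation


def localize(instance_scores, class_index):
    """Index of the instance that contributes most to the given class."""
    return int(instance_scores[:, class_index].argmax())


# Three instances scored for two event classes (made-up numbers).
scores = np.array([[0.1, 0.7],
                   [0.9, 0.2],
                   [0.3, 0.4]])
bag_scores = video_level_scores(scores)
predicted_class = int(bag_scores.argmax())
best_instance = localize(scores, predicted_class)
```

Because only the bag-level score is compared with the video-level label during training, the instance that wins the pooling doubles as a localization of the event.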

Super-resolution Ultrasound Localization Microscopy through Deep Learning

Ruud J.G. van Sloun , Oren Solomon , Matthew Bruce , Zin Z. Khaing , Hessel Wijkstra , Yonina C. Eldar , Massimo Mischi

Comments: 14 pages

Subjects: Signal Processing (eess.SP); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)

Ultrasound localization microscopy has enabled super-resolution vascular

imaging in laboratory environments through precise localization of individual

ultrasound contrast agents across numerous imaging frames. However, analysis of

high-density regions with significant overlaps among the agents’ point spread

responses yields high localization errors, constraining the technique to

low-concentration conditions. As such, long acquisition times are required to

sufficiently cover the vascular bed. In this work, we present a fast and

precise method for obtaining super-resolution vascular images from high-density

contrast-enhanced ultrasound imaging data. This method, which we term Deep

Ultrasound Localization Microscopy (Deep-ULM), exploits modern deep learning

strategies and employs a convolutional neural network to perform localization

microscopy in dense scenarios. This end-to-end fully convolutional neural

network architecture is trained effectively using on-line synthesized data,

enabling robust inference in-vivo under a wide variety of imaging conditions.

We show that deep learning attains super-resolution with challenging

contrast-agent concentrations (microbubble densities), both in-silico as well

as in-vivo, as we go from ultrasound scans of a rodent spinal cord in an

experimental setting to standard clinically-acquired recordings in a human

prostate. Deep-ULM achieves high quality sub-diffraction recovery, and is

suitable for real-time applications, resolving about 135 high-resolution

64×64-patches per second on a standard PC. Exploiting GPU computation, this

number increases to 2500 patches per second.

Analyzing Solar Irradiance Variation From GPS and Cameras

Shilpa Manandhar , Soumyabrata Dev , Yee Hui Lee , Yu Song Meng

Comments: Published in IEEE AP-S Symposium on Antennas and Propagation and USNC-URSI Radio Science Meeting, 2018

Subjects: Instrumentation and Methods for Astrophysics (astro-ph.IM); Computer Vision and Pattern Recognition (cs.CV)

The total amount of solar irradiance falling on the earth’s surface is an

important area of study among photovoltaic (PV) engineers and remote

sensing analysts. The received solar irradiance impacts the total amount of

generated solar energy. However, this generation is often hindered by the high

degree of solar irradiance variability. In this paper, we study the main

factors behind such variability with the assistance of Global Positioning

System (GPS) and ground-based, high-resolution sky cameras. This analysis will

also be helpful for understanding cloud phenomena and other events in the

earth’s atmosphere.

Revisiting Small Batch Training for Deep Neural Networks

Dominic Masters , Carlo Luschi

Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

Modern deep neural network training is typically based on mini-batch

stochastic gradient optimization. While the use of large mini-batches increases

the available computational parallelism, small batch training has been shown to

provide improved generalization performance and allows a significantly smaller

memory footprint, which might also be exploited to improve machine throughput.

In this paper, we review common assumptions on learning rate scaling and

training duration, as a basis for an experimental comparison of test

performance for different mini-batch sizes. We adopt a learning rate that

corresponds to a constant average weight update per gradient calculation (i.e.,

per unit cost of computation), and point out that this results in a variance of

the weight updates that increases linearly with the mini-batch size (m).

The collected experimental results for the CIFAR-10, CIFAR-100 and ImageNet

datasets show that increasing the mini-batch size progressively reduces the

range of learning rates that provide stable convergence and acceptable test

performance. On the other hand, small mini-batch sizes provide more up-to-date

gradient calculations, which yields more stable and reliable training. The best

performance has been consistently obtained for mini-batch sizes between (m = 2)

and (m = 32), which contrasts with recent work advocating the use of mini-batch

sizes in the thousands.
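
As a rough illustration of the learning-rate convention described in this abstract (not the authors' code): if each update sums m per-example gradients and the step size applied per gradient calculation is held fixed, the variance of a single weight update grows linearly with m. The sketch below checks this numerically under the assumption that per-example gradients are i.i.d. zero-mean Gaussian scalars; the names `update_variance`, `predicted_variance`, `base_lr` and `grad_std` are hypothetical.

```python
import random
import statistics


def update_variance(m, base_lr=0.01, grad_std=1.0, trials=5000, seed=0):
    """Empirical variance of one SGD update that sums m per-example gradients.

    Assumption: gradients are i.i.d. N(0, grad_std^2) scalars, and the update is
    -base_lr * sum(g_i), i.e. the step size per gradient calculation is fixed.
    """
    rng = random.Random(seed)
    updates = [
        -base_lr * sum(rng.gauss(0.0, grad_std) for _ in range(m))
        for _ in range(trials)
    ]
    return statistics.pvariance(updates)


def predicted_variance(m, base_lr=0.01, grad_std=1.0):
    """Variance of a sum of m independent terms: linear in m."""
    return m * (base_lr * grad_std) ** 2


if __name__ == "__main__":
    for m in (2, 8, 32, 128):
        print(m, update_variance(m), predicted_variance(m))
```

Under these assumptions the empirical and predicted columns should track each other closely, which is one way to see why larger mini-batches narrow the range of usable learning rates.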

Video based Contextual Question Answering

Akash Ganesan , Divyansh Pal , Karthik Muthuraman , Shubham Dash

Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)

The primary aim of this project is to build a contextual Question-Answering

model for videos. The current methodologies provide a robust model for

image-based Question-Answering, but we aim to generalize this approach to

videos. We propose a graphical representation of video which is able to handle

several types of queries across the whole video. For example, if a frame has an

image of a man and a cat sitting, it should be able to handle queries like

"where is the cat sitting with respect to the man?" or "what is the man holding

in his hand?". It should also be able to answer queries relating to temporal

relationships.

Sampling-free Uncertainty Estimation in Gated Recurrent Units with Exponential Families

Seong Jae Hwang , Ronak Mehta , Vikas Singh

Comments: First version. Submitted to ECCV 2018

Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

There has recently been a concerted effort to derive mechanisms in vision and

machine learning systems to offer uncertainty estimates of the predictions they

make. Clearly, there are enormous benefits to a system that is not only

accurate but also has a sense for when it is not sure. Existing proposals

center around Bayesian interpretations of modern deep architectures — these

are effective but can often be computationally demanding. We show how classical

ideas in the literature on exponential families on probabilistic networks

provide an excellent starting point to derive uncertainty estimates in Gated

Recurrent Units (GRU). Our proposal directly quantifies uncertainty

deterministically, without the need for costly sampling-based estimation. We

demonstrate how our model can be used to quantitatively and qualitatively

measure uncertainty in unsupervised image sequence prediction. To our

knowledge, this is the first result describing sampling-free uncertainty

estimation for powerful sequential models such as GRUs.

Artificial Intelligence

Delegating via Quitting Games

Juan Afanador , Nir Oren , Murilo S. Baptista

Subjects: Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)

Delegation allows an agent to request that another agent completes a task. In

many situations the task may be delegated onwards, and this process can repeat

until it is eventually, successfully or unsuccessfully, performed. We consider

policies to guide an agent in choosing who to delegate to when such recursive

interactions are possible. These policies, based on quitting games and

multi-armed bandits, were empirically tested for effectiveness. Our results

indicate that the quitting game based policies outperform those which do not

explicitly account for the recursive nature of delegation.

Preference-Guided Planning: An Active Elicitation Approach

Mayukh Das , Phillip Odom , Md. Rakibul Islam , Janardhan Rao (Jana) Doppa , Dan Roth , Sriraam Natarajan

Comments: Under Review at Knowledge-Based Systems (Elsevier); “Extended Abstract” accepted and to appear at AAMAS 2018

Subjects: Artificial Intelligence (cs.AI)

Planning with preferences has been employed extensively to quickly generate

high-quality plans. However, it may be difficult for the human expert to supply

this information without knowledge of the reasoning employed by the planner and

the distribution of planning problems. We consider the problem of actively

eliciting preferences from a human expert during the planning process.

Specifically, we study this problem in the context of the Hierarchical Task

Network (HTN) planning framework as it allows easy interaction with the human.

Our experimental results on several diverse planning domains show that the

preferences gathered using the proposed approach improve the quality and speed

of the planner, while reducing the burden on the human expert.

Cross-domain Dialogue Policy Transfer via Simultaneous Speech-act and Slot Alignment

Kaixiang Mo , Yu Zhang , Qiang Yang , Pascale Fung

Comments: v7

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

Dialogue policy transfer enables us to build dialogue policies in a target

domain with little data by leveraging knowledge from a source domain with

plenty of data. Dialogue sentences are usually represented by speech-acts and

domain slots, and the dialogue policy transfer is usually achieved by assigning

a slot mapping matrix based on human heuristics. However, existing dialogue

policy transfer methods cannot transfer across dialogue domains with different

speech-acts, for example, between systems built by different companies. Also,

they depend on either common slots or slot entropy, which are not available

when the source and target slots are totally disjoint and no database is

available to calculate the slot entropy. To solve this problem, we propose a

Policy tRansfer across dOMaIns and SpEech-acts (PROMISE) model, which is able

to transfer dialogue policies across domains with different speech-acts and

disjoint slots. The PROMISE model can learn to align different speech-acts and

slots simultaneously, and it does not require common slots or the calculation

of the slot entropy. Experiments on both real-world dialogue data and

simulations demonstrate that the PROMISE model can effectively transfer dialogue

policies across domains with different speech-acts and disjoint slots.

An Ensemble Generation Method Based on Instance Hardness

Felipe N. Walmsley , George D. C. Cavalcanti , Dayvid V. R. Oliveira , Rafael M. O. Cruz , Robert Sabourin

Comments: Paper accepted for publication on IJCNN 2018

Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

In Machine Learning, ensemble methods have been receiving a great deal of

attention. Techniques such as Bagging and Boosting have been successfully

applied to a variety of problems. Nevertheless, such techniques are still

susceptible to the effects of noise and outliers in the training data. We

propose a new method for the generation of pools of classifiers based on

Bagging, in which the probability of an instance being selected during the

resampling process is inversely proportional to its instance hardness, which

can be understood as the likelihood of an instance being misclassified,

regardless of the choice of classifier. The goal of the proposed method is to

remove noisy data without sacrificing the hard instances which are likely to be

found on class boundaries. We evaluate the performance of the method in

nineteen public data sets, and compare it to the performance of the Bagging and

Random Subspace algorithms. Our experiments show that in high noise scenarios

the accuracy of our method is significantly better than that of Bagging.
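
The resampling idea in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes instance hardness values in [0, 1] are already available (how they are estimated is not stated in the abstract), and it uses the hypothetical weighting 1 / (hardness + eps) to realise "inversely proportional" while avoiding division by zero. The function name and `eps` are assumptions.

```python
import random


def hardness_weighted_bags(hardness, n_bags, eps=1e-3, seed=0):
    """Draw bootstrap samples in which harder instances are selected less often.

    hardness: list of per-instance hardness estimates in [0, 1] (assumed given).
    Returns n_bags lists of instance indices, each the size of the training set.
    """
    rng = random.Random(seed)
    # Selection weight is inversely proportional to hardness (eps avoids 1/0).
    weights = [1.0 / (h + eps) for h in hardness]
    indices = list(range(len(hardness)))
    return [
        rng.choices(indices, weights=weights, k=len(hardness))
        for _ in range(n_bags)
    ]


if __name__ == "__main__":
    # Instance 2 is treated as likely noise (high hardness).
    bags = hardness_weighted_bags([0.05, 0.05, 0.9, 0.05], n_bags=3)
    for bag in bags:
        print(bag)
```

Each bag would then be used to train one base classifier, as in standard Bagging; only the sampling distribution differs.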

Stylistic Variation in Social Media Part-of-Speech Tagging

Murali Raghu Babu Balusu , Taha Merghani , Jacob Eisenstein

Comments: 9 pages, Published in Proceedings of NAACL workshop on stylistic variation (2018)

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

Social media features substantial stylistic variation, raising new challenges

for syntactic analysis of online writing. However, this variation is often

aligned with author attributes such as age, gender, and geography, as well as

more readily-available social network metadata. In this paper, we report new

evidence on the link between language and social networks in the task of

part-of-speech tagging. We find that tagger error rates are correlated with

network structure, with high accuracy in some parts of the network, and lower

accuracy elsewhere. As a result, tagger accuracy depends on training from a

balanced sample of the network, rather than training on texts from a narrow

subcommunity. We also describe our attempts to add robustness to stylistic

variation, by building a mixture-of-experts model in which each expert is

associated with a region of the social network. While prior work found that

similar approaches yield performance improvements in sentiment analysis and

entity linking, we were unable to obtain performance improvements in

part-of-speech tagging, despite strong evidence for the link between

part-of-speech error rates and social network structure.

Information Retrieval

The Role-Relevance Model for Enhanced Semantic Targeting in Unstructured Text

Christopher A. George , Onur Ozdemir , Connie E. Fournelle , Kendra E. Moore

Comments: 10 pages, 3 figures, 6 tables, presented at SPIE Defense + Commercial Sensing: Next Generation Analyst (2018)

Subjects: Information Retrieval (cs.IR)

Personalized search provides a potentially powerful tool; however, it is

limited due to the large number of roles that a person has: parent, employee,

consumer, etc. We present the role-relevance algorithm: a search technique that

favors search results relevant to the user’s current role. The role-relevance

algorithm uses three factors to score documents: (1) the number of keywords

each document contains; (2) each document’s geographic relevance to the user’s

role (if applicable); and (3) each document’s topical relevance to the user’s

role (if applicable). Topical relevance is assessed using a novel extension to

Latent Dirichlet Allocation (LDA) that allows standard LDA to score document

relevance to user-defined topics. Overall results on a pre-labeled corpus show

an average improvement in search precision of approximately 20% compared to

keyword search alone.

twAwler: A lightweight twitter crawler

Polyvios Pratikakis

Comments: 8 pages, 7 figures, about to submit for review

Subjects: Social and Information Networks (cs.SI); Information Retrieval (cs.IR)

This paper presents twAwler, a lightweight twitter crawler that targets

language-specific communities of users. twAwler takes advantage of multiple

endpoints of the twitter API to explore user relations and quickly recognize

users belonging to the targeted set. It performs a complete crawl for all

users, discovering many standard user relations, including the retweet graph,

mention graph, reply graph, quote graph, follow graph, etc. twAwler respects

all twitter policies and rate limits, while being able to monitor large communities

of active users.

twAwler was used between August 2016 and March 2018 to generate an extensive

dataset of close to all Greek-speaking twitter accounts (about 330 thousand)

and their tweets and relations. In total, the crawler has gathered 750 million

tweets of which 424 million are in Greek; 750 million follow relations;

information about 300 thousand lists, their members (119 million member

relations) and subscribers (27 thousand subscription relations); 705 thousand

trending topics; information on 52 million users in total of which 292 thousand

have been since suspended, 141 thousand have deleted their account, and 3.5

million are protected and cannot be crawled. twAwler mines the collected tweets

for the retweet, quote, reply, and mention graphs, which, in addition to the

follow relation crawled, offer vast opportunities for analysis and further

research.

The FactChecker: Verifying Text Summaries of Relational Data Sets

Saehan Jo , Immanuel Trummer , Weicheng Yu , Daniel Liu , Niyati Mehta

Comments: 13 pages, 11 figures, 6 tables

Subjects: Databases (cs.DB); Information Retrieval (cs.IR)

We present a novel natural language query interface, the FactChecker, aimed

at text summaries of relational data sets. The tool focuses on natural language

claims that translate into an SQL query and a claimed query result. Similar in

spirit to a spell checker, the FactChecker marks up text passages that seem to

be inconsistent with the actual data. At the heart of the system is a

probabilistic model that reasons about the input document in a holistic

fashion. Based on claim keywords and the document structure, it maps each text

claim to a probability distribution over associated query translations. By

efficiently executing tens to hundreds of thousands of candidate translations

for a typical input document, the system maps text claims to correctness

probabilities. This process becomes practical via a specialized processing

backend, avoiding redundant work via query merging and result caching.

Verification is an interactive process in which users are shown tentative

results, enabling them to take corrective actions if necessary.

Our system was tested on a set of 53 public articles containing 392 claims.

Our test cases include articles from major newspapers, summaries of survey

results, and Wikipedia articles. Our tool revealed erroneous claims in roughly

a third of test cases. A detailed user study shows that users using our tool

are on average six times faster at checking text summaries, compared to generic

SQL interfaces. In fully automated verification, our tool achieves

significantly higher recall and precision than baselines from the areas of

natural language query interfaces and fact checking.

Approaches for Enriching and Improving Textual Knowledge Bases

Besnik Fetahu

Comments: PhD thesis, 2017

Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)

Verifiability is one of the core editing principles in Wikipedia, where

editors are encouraged to provide citations for the added statements.

Statements can be any arbitrary piece of text, ranging from a sentence up to a

paragraph. However, in many cases, citations are either outdated, missing, or

link to non-existing references (e.g. dead URL, moved content etc.). In total,

in 20% of the cases such citations refer to news articles and represent the

second most cited source. Even in cases where citations are provided, there are

no explicit indicators for the span of a citation for a given piece of text. In

addition to issues related to the verifiability principle, many Wikipedia

entity pages are incomplete, with relevant information that is already

available in online news sources missing. Even for the already existing

citations, there is often a delay between the news publication time and the

reference time.

In this thesis, we address the aforementioned issues and propose automated

approaches that enforce the verifiability principle in Wikipedia, and suggest

relevant and missing news references for further enriching Wikipedia entity

pages.

Benchmarking Top-K Keyword and Top-K Document Processing with T²K² and T²K²D²

Ciprian-Octavian Truica (UPB), Jérôme Darmont (ERIC), Alexandru Boicea (UPB), Florin Radulescu (UPB)

Journal-ref: Future Generation Computer Systems, Elsevier, 2018, 85, pp.60-75.

https://www.sciencedirect.com/science/article/pii/S0167739X17323580

Subjects: Databases (cs.DB); Information Retrieval (cs.IR)

Top-k keyword and top-k document extraction are very popular text analysis

techniques. Top-k keywords and documents are often computed on-the-fly, but

they exploit weighted vocabularies that are costly to build. To compare

competing weighting schemes and database implementations, benchmarking is

customary. To the best of our knowledge, no benchmark currently addresses these

problems. Hence, in this paper, we present T²K², a top-k keywords and

documents benchmark, and its decision support-oriented evolution

T²K²D². Both benchmarks feature a real tweet dataset and queries

with various complexities and selectivities. They help evaluate weighting

schemes and database implementations in terms of computing performance. To

illustrate our benchmarks’ relevance and genericity, we successfully ran

performance tests on the TF-IDF and Okapi BM25 weighting schemes, on one hand,

and on different relational (Oracle, PostgreSQL) and document-oriented

(MongoDB) database implementations, on the other hand.

Computation and Language

Phrase-Based & Neural Unsupervised Machine Translation

Guillaume Lample , Myle Ott , Alexis Conneau , Ludovic Denoyer , Marc'Aurelio Ranzato

Subjects: Computation and Language (cs.CL)

Machine translation systems achieve near human-level performance on some

languages, yet their effectiveness strongly relies on the availability of large

amounts of bitexts, which hinders their applicability to the majority of

language pairs. This work investigates how to learn to translate when having

access to only large monolingual corpora in each language. We propose two model

variants, a neural and a phrase-based model. Both versions leverage automatic

generation of parallel data by backtranslating with a backward model operating

in the other direction, and the denoising effect of a language model trained on

the target side. These models are significantly better than methods from the

literature, while being simpler and having fewer hyper-parameters. On the

widely used WMT14 English-French and WMT16 German-English benchmarks, our

models respectively obtain 27.1 and 23.6 BLEU points without using a single

parallel sentence, outperforming the state of the art by more than 11 BLEU

points.

Learning Semantic Textual Similarity from Conversations

Yinfei Yang , Steve Yuan , Daniel Cer , Sheng-yi Kong , Noah Constant , Petr Pilar , Heming Ge , Yun-Hsuan Sung , Brian Strope , Ray Kurzweil

Comments: 10 pages, 8 Figures, 6 Tables

Subjects: Computation and Language (cs.CL)

We present a novel approach to learn representations for sentence-level

semantic similarity using conversational data. Our method trains an

unsupervised model to predict conversational input-response pairs. The

resulting sentence embeddings perform well on the semantic textual similarity

(STS) benchmark and SemEval 2017’s Community Question Answering (CQA) question

similarity subtask. Performance is further improved by introducing multitask

training combining the conversational input-response prediction task and a

natural language inference task. Extensive experiments show the proposed model

achieves the best performance among all neural models on the STS benchmark and

is competitive with the state-of-the-art feature engineered and mixed systems

in both tasks.

Improving Supervised Bilingual Mapping of Word Embeddings

Armand Joulin , Piotr Bojanowski , Tomas Mikolov , Edouard Grave

Subjects: Computation and Language (cs.CL); Learning (cs.LG)

Continuous word representations, learned on different languages, can be

aligned with remarkable precision. Using a small bilingual lexicon as training

data, learning the linear transformation is often formulated as a regression

problem using the square loss. The obtained mapping is known to suffer from the

hubness problem, when used for retrieval tasks (e.g. for word translation). To

address this issue, we propose to use a retrieval criterion instead of the

square loss for learning the mapping. We evaluate our method on word

translation, showing that our loss function leads to state-of-the-art results,

with the biggest improvements observed for distant language pairs such as

English-Chinese.

Phrase-Indexed Question Answering: A New Challenge for Scalable Document Comprehension

Minjoon Seo , Tom Kwiatkowski , Ankur P. Parikh , Ali Farhadi , Hannaneh Hajishirzi

Comments: 6 pages

Subjects: Computation and Language (cs.CL)

The current trend of extractive question answering (QA) heavily relies on the

joint encoding of the document and the question. In this paper, we formalize a

new modular variant of extractive QA, Phrase-Indexed Question Answering

(PI-QA), that enforces complete independence of the document encoder from the

question by building the standalone representation of the document discourse, a

key research goal in machine reading comprehension. That is, the document

encoder generates an index vector for each answer candidate phrase in the

document; at inference time, each question is mapped to the same vector space

and the answer with the nearest index vector is obtained. The formulation also

implies a significant scalability advantage since the index vectors can be

pre-computed and hashed offline for efficient retrieval. We experiment with

baseline models for the new task, which achieve a reasonable accuracy but

significantly underperform unconstrained QA models. We invite the QA research

community to engage in PI-QA for closing the gap.
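
The inference step described in this abstract (map the question into the phrase vector space, then return the phrase with the nearest index vector) can be sketched as below. This is only an illustration under stated assumptions: the phrase and question vectors are hand-written toy values standing in for the outputs of trained encoders, squared Euclidean distance is an assumed choice of "nearest", and `nearest_phrase` is a hypothetical name.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_phrase(question_vec, phrase_index):
    """Return the candidate phrase whose index vector is closest to the question.

    phrase_index: dict mapping candidate answer phrase -> precomputed vector.
    The index is built offline and never sees the question.
    """
    return min(
        phrase_index,
        key=lambda phrase: squared_distance(question_vec, phrase_index[phrase]),
    )


if __name__ == "__main__":
    # Toy index: in a real system a document encoder would produce these vectors.
    index = {
        "in 1998": [0.9, 0.1, 0.0],
        "Larry Page": [0.1, 0.8, 0.2],
        "Mountain View": [0.0, 0.2, 0.9],
    }
    # Toy question vector, standing in for the output of a question encoder.
    print(nearest_phrase([0.2, 0.9, 0.1], index))
```

Because the index depends only on the document, the linear scan here could be swapped for an approximate nearest-neighbour structure without touching the encoders.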

Generating syntactically varied realisations from AMR graphs

Kris Cao , Stephen Clark

Subjects: Computation and Language (cs.CL)

Generating from Abstract Meaning Representation (AMR) is an underspecified

problem, as many syntactic decisions are not specified by the semantic graph.

We learn a sequence-to-sequence model that generates possible constituency

trees for an AMR graph, and then train another model to generate text

realisations conditioned on both an AMR graph and a constituency tree. We show

that factorising the model this way lets us effectively use parse information,

obtaining competitive BLEU scores on self-generated parses and impressive BLEU

scores with oracle parses. We also demonstrate that we can generate

meaning-preserving syntactic paraphrases of the same AMR graph.

Lightweight Adaptive Mixture of Neural and N-gram Language Models

Anton Bakhtin , Arthur Szlam , Marc'Aurelio Ranzato , Edouard Grave

Subjects: Computation and Language (cs.CL)

It is often the case that the best performing language model is an ensemble

of a neural language model with n-grams. In this work, we propose a method to

improve how these two models are combined. By using a small network which

predicts the mixture weight between the two models, we adapt their relative

importance at each time step. Because the gating network is small, it trains

quickly on small amounts of held out data, and does not add overhead at scoring

time. Our experiments carried out on the One Billion Word benchmark show a

significant improvement over the state of the art ensemble without retraining

of the basic modules.
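
A minimal sketch of the kind of adaptive mixture described in this abstract: at each time step a small gate produces a weight in (0, 1) that interpolates the neural and n-gram next-word distributions. The abstract does not specify the gate's inputs or architecture, so the single logistic unit, the feature vector and all names below are assumptions for illustration.

```python
import math


def gate_weight(features, weights, bias):
    """Hypothetical gate: one logistic unit mapping context features to (0, 1)."""
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def mix_distributions(p_neural, p_ngram, lam):
    """Interpolate two next-word distributions: lam * neural + (1 - lam) * n-gram."""
    vocab = set(p_neural) | set(p_ngram)
    return {
        w: lam * p_neural.get(w, 0.0) + (1.0 - lam) * p_ngram.get(w, 0.0)
        for w in vocab
    }


if __name__ == "__main__":
    p_nn = {"cat": 0.7, "dog": 0.2, "car": 0.1}
    p_ng = {"cat": 0.3, "dog": 0.6, "car": 0.1}
    lam = gate_weight(features=[1.0, 0.5], weights=[0.8, -0.4], bias=0.0)
    print(lam, mix_distributions(p_nn, p_ng, lam))
```

Since the mixture is a convex combination, the result remains a valid probability distribution for any gate output, and only the gate's parameters need training while both base models stay fixed.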

Cross-domain Dialogue Policy Transfer via Simultaneous Speech-act and Slot Alignment

Kaixiang Mo , Yu Zhang , Qiang Yang , Pascale Fung

Comments: v7

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

Dialogue policy transfer enables us to build dialogue policies in a target

domain with little data by leveraging knowledge from a source domain with

plenty of data. Dialogue sentences are usually represented by speech-acts and

domain slots, and the dialogue policy transfer is usually achieved by assigning

a slot mapping matrix based on human heuristics. However, existing dialogue

policy transfer methods cannot transfer across dialogue domains with different

speech-acts, for example, between systems built by different companies. Also,

they depend on either common slots or slot entropy, which are not available

when the source and target slots are totally disjoint and no database is

available to calculate the slot entropy. To solve this problem, we propose a

Policy tRansfer across dOMaIns and SpEech-acts (PROMISE) model, which is able

to transfer dialogue policies across domains with different speech-acts and

disjoint slots. The PROMISE model can learn to align different speech-acts and

slots simultaneously, and it does not require common slots or the calculation

of the slot entropy. Experiments on both real-world dialogue data and

simulations demonstrate that the PROMISE model can effectively transfer dialogue

policies across domains with different speech-acts and disjoint slots.

Acquisition of Phrase Correspondences using Natural Deduction Proofs

Hitomi Yanaka , Koji Mineshima , Pascual Martinez-Gomez , Daisuke Bekki

Comments: 11 pages, 4 figures, accepted as long paper of NAACL HLT 2018

Subjects: Computation and Language (cs.CL)

How to identify, extract, and use phrasal knowledge is a crucial problem for

the task of Recognizing Textual Entailment (RTE). To solve this problem, we

propose a method for detecting paraphrases via natural deduction proofs of

semantic relations between sentence pairs. Our solution relies on a graph

reformulation of partial variable unifications and an algorithm that induces

subgraph alignments between meaning representations. Experiments show that our

method can automatically detect various paraphrases that are absent from

existing paraphrase databases. In addition, the detection of paraphrases using

proof information improves the accuracy of RTE tasks.

ClaimRank: Detecting Check-Worthy Claims in Arabic and English

Israa Jaradat , Pepa Gencheva , Alberto Barron-Cedeno , Lluis Marquez , Preslav Nakov

Comments: Check-worthiness; Fact-Checking; Veracity; Community-Question Answering; Neural Networks; Arabic; English

Journal-ref: NAACL-2018

Subjects: Computation and Language (cs.CL)

We present ClaimRank, an online system for detecting check-worthy claims.

While originally trained on political debates, the system can work for any kind

of text, e.g., interviews or regular news articles. Its aim is to facilitate

manual fact-checking efforts by prioritizing the claims that fact-checkers

should consider first. ClaimRank supports both Arabic and English; it is

trained on actual annotations from nine reputable fact-checking organizations

(PolitiFact, FactCheck, ABC, CNN, NPR, NYT, Chicago Tribune, The Guardian, and

Washington Post), and thus it can mimic the claim selection strategies for each

and any of them, as well as for the union of them all.

Approaches for Enriching and Improving Textual Knowledge Bases

Besnik Fetahu

Comments: PhD thesis, 2017

Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)

Verifiability is one of the core editing principles in Wikipedia, where

editors are encouraged to provide citations for the added statements.

Statements can be any arbitrary piece of text, ranging from a sentence up to a

paragraph. However, in many cases, citations are either outdated, missing, or

link to non-existing references (e.g. dead URL, moved content etc.). In total,

in 20% of the cases such citations refer to news articles and represent the

second most cited source. Even in cases where citations are provided, there are

no explicit indicators for the span of a citation for a given piece of text. In

addition to issues related to the verifiability principle, many Wikipedia

entity pages are incomplete, with relevant information that is already

available in online news sources missing. Even for the already existing

citations, there is often a delay between the news publication time and the

reference time.

In this thesis, we address the aforementioned issues and propose automated

approaches that enforce the verifiability principle in Wikipedia, and suggest

relevant and missing news references for further enriching Wikipedia entity

pages.

Automatic Stance Detection Using End-to-End Memory Networks

Mitra Mohtarami , Ramy Baly , James Glass , Preslav Nakov , Lluis Marquez , Alessandro Moschitti

Comments: NAACL-2018; Stance detection; Fact-Checking; Veracity; Memory networks; Neural Networks; Distributed Representations

Subjects: Computation and Language (cs.CL)

We present a novel end-to-end memory network for stance detection, which

jointly (i) predicts whether a document agrees, disagrees, discusses or is

unrelated with respect to a given target claim, and also (ii) extracts snippets

of evidence for that prediction. The network operates at the paragraph level

and integrates convolutional and recurrent neural networks, as well as a

similarity matrix as part of the overall architecture. The experimental

evaluation on the Fake News Challenge dataset shows state-of-the-art

performance.

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding

Alex Wang , Amanpreet Singh , Julian Michael , Felix Hill , Omer Levy , Samuel R. Bowman

Comments: this https URL

Subjects: Computation and Language (cs.CL)

For natural language understanding (NLU) technology to be maximally useful,

both practically and as a scientific object of study, it must be general: it

must be able to process language in a way that is not exclusively tailored to

any one specific task or dataset. In pursuit of this objective, we introduce

the General Language Understanding Evaluation benchmark (GLUE), a tool for

evaluating and analyzing the performance of models across a diverse range of

existing NLU tasks. GLUE is model-agnostic, but it incentivizes sharing

knowledge across tasks because certain tasks have very limited training data.

We further provide a hand-crafted diagnostic test suite that enables detailed

linguistic analysis of NLU models. We evaluate baselines based on current

methods for multi-task and transfer learning and find that they do not

immediately give substantial improvements over the aggregate performance of

training a separate model per task, indicating room for improvement in

developing general and robust NLU systems.

Sentence Simplification with Memory-Augmented Neural Networks

Tu Vu , Baotian Hu , Tsendsuren Munkhdalai , Hong Yu

Comments: Accepted as a conference paper at NAACL HLT 2018

Subjects

:

Computation and Language (cs.CL)

Sentence simplification aims to simplify the content and structure of complex

sentences, and thus make them easier to interpret for human readers, and easier

to process for downstream NLP applications. Recent advances in neural machine

translation have paved the way for novel approaches to the task. In this paper,

we adapt an architecture with augmented memory capacities called Neural

Semantic Encoders (Munkhdalai and Yu, 2017) for sentence simplification. Our

experiments demonstrate the effectiveness of our approach on different

simplification datasets, both in terms of automatic evaluation measures and

human judgments.

Video based Contextual Question Answering

Akash Ganesan , Divyansh Pal , Karthik Muthuraman , Shubham Dash Subjects : Computation and Language (cs.CL) ; Computer Vision and Pattern Recognition (cs.CV)

The primary aim of this project is to build a contextual Question-Answering

model for videos. The current methodologies provide a robust model for image

based Question-Answering, but we aim to generalize this approach to

videos. We propose a graphical representation of video which is able to handle

several types of queries across the whole video. For example, if a frame has an

image of a man and a cat sitting, it should be able to handle queries like

“where is the cat sitting with respect to the man?” or “what is the man holding

in his hand?” It should be able to answer queries relating to temporal

relationships also.

A Predictive Model for Notional Anaphora in English

Amir Zeldes

Comments: NAACL 2018 Workshop on Computational Models of Reference, Anaphora, and Coreference (CRAC). New Orleans, LA

Subjects

:

Computation and Language (cs.CL)

Notional anaphors are pronouns which disagree with their antecedents’

grammatical categories for notional reasons, such as plural to singular

agreement in: ‘the government … they’. Since such cases are rare and conflict

with evidence from strictly agreeing cases (‘the government … it’), they

present a substantial challenge to both coreference resolution and referring

expression generation. Using the OntoNotes corpus, this paper takes an ensemble

approach to predicting English notional anaphora in context on the basis of the

largest empirical data to date. In addition to state of the art prediction

accuracy, the results suggest that theoretical approaches positing a plural

construal at the antecedent’s utterance are insufficient, and that

circumstances at the anaphor’s utterance location, as well as global factors

such as genre, have a strong effect on the choice of referring expression.

Stylistic Variation in Social Media Part-of-Speech Tagging

Murali Raghu Babu Balusu , Taha Merghani , Jacob Eisenstein

Comments: 9 pages, Published in Proceedings of NAACL workshop on stylistic variation (2018)

Subjects

:

Computation and Language (cs.CL)

; Artificial Intelligence (cs.AI)

Social media features substantial stylistic variation, raising new challenges

for syntactic analysis of online writing. However, this variation is often

aligned with author attributes such as age, gender, and geography, as well as

more readily-available social network metadata. In this paper, we report new

evidence on the link between language and social networks in the task of

part-of-speech tagging. We find that tagger error rates are correlated with

network structure, with high accuracy in some parts of the network, and lower

accuracy elsewhere. As a result, tagger accuracy depends on training from a

balanced sample of the network, rather than training on texts from a narrow

subcommunity. We also describe our attempts to add robustness to stylistic

variation, by building a mixture-of-experts model in which each expert is

associated with a region of the social network. While prior work found that

similar approaches yield performance improvements in sentiment analysis and

entity linking, we were unable to obtain performance improvements in

part-of-speech tagging, despite strong evidence for the link between

part-of-speech error rates and social network structure.

Assessing Language Proficiency from Eye Movements in Reading

Yevgeni Berzak , Boris Katz , Roger Levy

Comments: NAACL 2018

Subjects

:

Computation and Language (cs.CL)

We present a novel approach for determining learners’ second language

proficiency which utilizes behavioral traces of eye movements during reading.

Our approach provides stand-alone eyetracking based English proficiency scores

which reflect the extent to which the learner’s gaze patterns in reading are

similar to those of native English speakers. We show that our scores correlate

strongly with standardized English proficiency tests. We also demonstrate that

gaze information can be used to accurately predict the outcomes of such tests.

Our approach yields the strongest performance when the test taker is presented

with a suite of sentences for which we have eyetracking data from other

readers. However, it remains effective even using eyetracking with sentences

for which eye movement data have not been previously collected. By deriving

proficiency as an automatic byproduct of eye movements during ordinary reading,

our approach offers a potentially valuable new tool for second language

proficiency assessment. More broadly, our results open the door to future

methods for inferring reader characteristics from the behavioral traces of

reading.

Distributed, Parallel, and Cluster Computing

Cut to Fit: Tailoring the Partitioning to the Computation

Iacovos Kolokasis , Polyvios Pratikakis

Comments: 14 pages, 6 figures, has been submitted for review

Subjects

:

Distributed, Parallel, and Cluster Computing (cs.DC)

Social Graph Analytics applications are very often built using off-the-shelf

analytics frameworks. These, however, are profiled and optimized for the

general case and have to perform for all kinds of graphs. This paper

investigates how knowledge of the application and the dataset can help optimize

performance with minimal effort. We concentrate on the impact of partitioning

strategies on the performance of computations on social graphs. We evaluate six

graph partitioning algorithms on a set of six social graphs, using four

standard graph algorithms by measuring a set of five partitioning metrics.

We analyze the performance of each partitioning strategy with respect to (i)

the properties of the graph dataset, (ii) each analytics computation, and (iii) the number of

partitions. We discover that communication cost is the best predictor of

performance for most, but not all, analytics computations. We also find that

the best partitioning strategy for a particular kind of algorithm may not be

the best for another, and that optimizing for the general case of all

algorithms may not select the optimal partitioning strategy for a given graph

algorithm. We conclude with insights on selecting the right data partitioning

strategy, which has significant impact on the performance of large graph

analytics computations; certainly enough to warrant optimization of the

partitioning strategy to the computation and to the dataset.

CUDA Support in GNA Data Analysis Framework

Anna Fatkina , Maxim Gonchar , Liudmila Kolupaeva , Dmitry Naumov , Konstantin Treskov

Comments: 12 pages, 7 figures, ICCSA 2018, submitted to Lecture Notes in Computer Science (Springer Verlag)

Subjects

:

Distributed, Parallel, and Cluster Computing (cs.DC)

; Computational Engineering, Finance, and Science (cs.CE)

Usage of GPUs as co-processors is a well-established approach to accelerate

costly algorithms operating on matrices and vectors.

We aim to further improve the performance of the Global Neutrino Analysis

framework (GNA) by adding GPU support in a way that is transparent to the end

user. To achieve our goal we use CUDA, a state of the art technology providing

GPGPU programming methods.

In this paper we describe new features of GNA related to CUDA support. Some

specific framework features that influence GPGPU integration are also

explained. The paper investigates the feasibility of GPU technology application

and shows an example of the achieved acceleration of an algorithm implemented

within the framework. Benchmarks show a significant performance increase when using

GPU transformations.

The project is currently in the developmental phase. Our plans include

implementation of the set of transformations necessary for the data analysis in

the GNA framework and tests of the GPU expediency in the complete analysis

chain.

Estimating Latencies of Task Sequences in Multi-Core Automotive ECUs

Max J. Friese , Thorsten Ehlers , Dirk Nowotka Subjects : Distributed, Parallel, and Cluster Computing (cs.DC)

The computation of a cyber-physical system’s reaction to a stimulus typically

involves the execution of several tasks. The delay between stimulus and

reaction thus depends on the interaction of these tasks and is subject to

timing constraints. Such constraints exist for a number of reasons and range

from possible impacts on customer experiences to safety requirements. We

present a technique to determine end-to-end latencies of such task sequences.

The technique is demonstrated on the example of electronic control units (ECUs)

in automotive embedded real-time systems. Our approach is able to deal with

multi-core architectures and supports four different activation patterns,

including interrupts. It is the first formal analysis approach making use of

load assumptions in order to exclude infeasible data propagation paths without

the knowledge of worst-case execution times or worst-case response times. We

employ a constraint programming solver to compute bounds on end-to-end

latencies.

OpenFPM: A scalable open framework for particle and particle-mesh codes on parallel computers

Pietro Incardona , Antonio Leo , Yaroslav Zaluzhnyi , Rajesh Ramaswamy , Ivo F. Sbalzarini

Comments: 32 pages, 12 figures

Subjects

:

Distributed, Parallel, and Cluster Computing (cs.DC)

; Mathematical Software (cs.MS); Software Engineering (cs.SE); Computational Physics (physics.comp-ph)

Scalable and efficient numerical simulations continue to gain importance, as

computation is firmly established as the third pillar of discovery, alongside

theory and experiment. Meanwhile, the performance of computing hardware grows

through increasing heterogeneous parallelism, enabling simulations of ever more

complex models. However, efficiently implementing scalable codes on

heterogeneous, distributed hardware systems becomes the bottleneck. This

bottleneck can be alleviated by intermediate software layers that provide

higher-level abstractions closer to the problem domain, hence allowing the

computational scientist to focus on the simulation. Here, we present OpenFPM,

an open and scalable framework that provides an abstraction layer for numerical

simulations using particles and/or meshes. OpenFPM provides transparent and

scalable infrastructure for shared-memory and distributed-memory

implementations of particles-only and hybrid particle-mesh simulations of both

discrete and continuous models, as well as non-simulation codes. This

infrastructure is complemented with portable implementations of frequently used

numerical routines, as well as interfaces to third-party libraries. We present

the architecture and design of OpenFPM, detail the underlying abstractions, and

benchmark the framework in applications ranging from Smoothed-Particle

Hydrodynamics (SPH) to Molecular Dynamics (MD), Discrete Element Methods (DEM),

Vortex Methods, stencil codes, high-dimensional Monte Carlo sampling (CMA-ES),

and Reaction-Diffusion solvers, comparing it to the current state of the art

and existing software frameworks.

The Power of Machine Learning and Market Design for Cloud Computing Admission Control

Ludwig Dierks , Ian Kash , Sven Seuken Subjects : Distributed, Parallel, and Cluster Computing (cs.DC) ; Performance (cs.PF)

Cloud computing providers must handle customer workloads that wish to scale

their use of resources such as virtual machines up and down over time.

Currently, this is often done using simple threshold policies to reserve large

parts of each cluster. This leads to low utilization of the cluster on average.

In this paper, we propose more sophisticated policies for controlling admission

to a cluster and demonstrate that our policies significantly increase cluster

utilization. We first introduce a model and fit its parameters on a data trace

from Microsoft Azure. We then design policies that estimate moments of each

workload’s distribution of future resource usage. Via simulations we show that,

while estimating the first moments of workloads leads to a substantial

improvement over the simple threshold policy, also taking the second moments

into account yields another improvement in utilization. We then evaluate how

much further this can be improved with learned or elicited prior information

and how to incentivize users to provide this information.
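
One way to picture a moment-based admission rule is the minimal sketch below. It is not the policy from the paper. It assumes that each workload's future usage is summarized by a mean and a variance, that workloads are independent so their variances add, and that a new workload is admitted only if the total mean plus a multiple of the total standard deviation fits within the cluster capacity. The function name `admit`, the `safety_factor` parameter, and all numbers are illustrative assumptions.

```python
import math


def admit(existing, candidate, capacity, safety_factor=0.0):
    """Decide whether to admit `candidate` to a cluster (hypothetical rule).

    existing  -- list of (mean, variance) pairs for workloads already admitted
    candidate -- (mean, variance) pair for the workload requesting admission
    capacity  -- total resource units available in the cluster
    safety_factor -- 0.0 uses first moments only; a positive value adds a
                     margin proportional to the total standard deviation
    """
    workloads = existing + [candidate]
    total_mean = sum(mean for mean, _ in workloads)
    # Assumption: workloads are independent, so variances are additive.
    total_var = sum(var for _, var in workloads)
    return total_mean + safety_factor * math.sqrt(total_var) <= capacity


running = [(40.0, 25.0), (30.0, 25.0)]
newcomer = (20.0, 50.0)
print(admit(running, newcomer, capacity=100.0))                     # True: 90 <= 100
print(admit(running, newcomer, capacity=100.0, safety_factor=2.0))  # False: 90 + 2*10 > 100
```

In this toy rule the variance term sizes the reserved headroom per workload mix instead of reserving a fixed fraction of the cluster for everyone.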

Parallel Quicksort without Pairwise Element Exchange

Jesper Larsson Träff Subjects : Distributed, Parallel, and Cluster Computing (cs.DC)

Standard implementations of 2-way, parallel, distributed memory Quicksort

algorithms exchange partitioned data elements at each level of the recursion.

This is not necessary: It suffices to exchange only the chosen pivots, while

postponing element redistribution to the bottom of the recursion. This reduces

the total volume of data exchanged from \(O(n\log p)\) to \(O(n)\), \(n\) being the

total number of elements to be sorted and \(p\) a power-of-two number of

processors, while preserving the flavor, characteristics and properties of a

Quicksort implementation. We give a template implementation based on this

observation, and compare against a standard, 2-way parallel Quicksort

implementation as well as other recent Quicksort implementations. We show

substantial, and considerably better absolute speed-up on a medium-large

InfiniBand cluster.
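
As a rough illustration of the volume claim, the sketch below counts element transfers under a simplifying worst-case assumption that is ours, not the paper's: in the standard scheme every element may be sent once at each of the log2(p) recursion levels, whereas in the postponed scheme each element is sent at most once, at the bottom of the recursion. Pivot traffic is ignored, and the function names are hypothetical.

```python
def _check_power_of_two(p: int) -> None:
    # p & (p - 1) is zero exactly when p is a power of two (for p >= 1).
    if p < 1 or p & (p - 1):
        raise ValueError("p must be a power of two")


def standard_exchange_volume(n: int, p: int) -> int:
    """Worst-case elements sent when data is exchanged at every recursion level."""
    _check_power_of_two(p)
    levels = p.bit_length() - 1  # log2(p) levels of 2-way recursion
    return n * levels


def postponed_exchange_volume(n: int, p: int) -> int:
    """Worst-case elements sent when redistribution happens once, at the end."""
    _check_power_of_two(p)
    return n


n, p = 1_000_000, 1024
print(standard_exchange_volume(n, p))   # 10000000
print(postponed_exchange_volume(n, p))  # 1000000
```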

Challenges and pitfalls of partitioning blockchains

Enrique Fynn , Fernando Pedone Subjects : Distributed, Parallel, and Cluster Computing (cs.DC)

Blockchain has received much attention in recent years. This immense

popularity has raised a number of concerns, scalability of blockchain systems

being a common one. In this paper, we seek to understand how Ethereum, a

well-established blockchain system, would respond to sharding. Sharding is a

prevalent technique to increase the scalability of distributed systems. To

understand how sharding would affect Ethereum, we model the Ethereum blockchain as

a graph and evaluate five methods to partition the graph. We analyze the

results using three metrics: the balance among shards, the number of

transactions that would involve multiple shards, and the amount of data that

would be relocated across shards upon a repartitioning of the system.
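
The three metrics can be made concrete on a toy graph. The abstract does not give their exact definitions, so the sketch below makes its own assumptions: accounts are vertices, transactions are edges, balance is the largest shard size divided by the ideal even size, and relocated data is approximated by the number of accounts whose shard changes. All function names are hypothetical.

```python
from collections import Counter


def shard_balance(assignment, num_shards):
    """Largest shard size divided by the ideal size; 1.0 means perfectly even."""
    sizes = Counter(assignment.values())
    ideal = len(assignment) / num_shards
    return max(sizes.values()) / ideal


def cross_shard_transactions(transactions, assignment):
    """Number of transactions whose two endpoints live on different shards."""
    return sum(1 for src, dst in transactions if assignment[src] != assignment[dst])


def relocated_accounts(old_assignment, new_assignment):
    """Number of accounts that change shard after a repartitioning."""
    return sum(
        1
        for account, shard in old_assignment.items()
        if account in new_assignment and new_assignment[account] != shard
    )


before = {"a": 0, "b": 0, "c": 1, "d": 1}
after = {"a": 0, "b": 1, "c": 1, "d": 1}
txs = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]

print(shard_balance(before, 2))               # 1.0
print(cross_shard_transactions(txs, before))  # 2
print(relocated_accounts(before, after))      # 1
```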

Analyzing astronomical data with Apache Spark

Julien Peloton , Christian Arnault , Stéphane Plaszczynski

Comments: 9 pages, 6 figures. Package available at this https URL

Subjects

:

Instrumentation and Methods for Astrophysics (astro-ph.IM)

; Distributed, Parallel, and Cluster Computing (cs.DC)

We investigate the performance of Apache Spark, a cluster computing

framework, for analyzing data from future LSST-like galaxy surveys. Apache

Spark's attempts to address big data problems have hitherto proved successful in

industry, but its main use is often limited to naively structured data. We

show how to manage more complex binary data structures such as those handled in

astrophysics experiments, within a distributed environment. To this purpose, we

first designed and implemented a Spark connector to handle sets of arbitrarily

large FITS files, called spark-fits. The user interface is such that a simple

file “drag-and-drop” to a cluster gives full advantage of the framework. We

demonstrate the very high scalability of spark-fits using the LSST fast

simulation tool, CoLoRe, and present the methodologies for measuring and tuning

the performance bottlenecks for the workloads, scaling up to terabytes of FITS

data on the Cloud@VirtualData, located at Université Paris Sud. We also

evaluate its performance on Cori, a High-Performance Computing system located

at NERSC, and widely used in the scientific community.

Identity Aging: Efficient Blockchain Consensus

Mansoor Ahmed , Kari Kostiainen Subjects : Cryptography and Security (cs.CR) ; Distributed, Parallel, and Cluster Computing (cs.DC)

Decentralized currencies and similar blockchain applications require

consensus. Bitcoin achieves eventual consensus in a fully-decentralized

setting, but provides very low throughput and high latency with excessive

energy consumption. In this paper, we propose identity aging as a novel and

more efficient consensus approach. Our main idea is to establish reliable,

long-term identities and choose the oldest identity as the miner on each round.

Based on this approach, we design two blockchain systems. Our first system,

SCIFER, leverages Intel’s SGX attestation for identity bootstrapping in a

partially-decentralized setting, where blockchain is permissionless, but we

trust Intel for attestation. Our second system, DIFER, creates new identities

through a novel mining mechanism and provides consensus in a

fully-decentralized setting, similar to Bitcoin. One of the main benefits of

identity aging is that it does not require constant computation. Our analysis

and experiments show that identity aging provides significant performance

improvements over Bitcoin with strong security guarantees.

Learning

Modelling customer online behaviours with neural networks: applications to conversion prediction and advertising retargeting

Yanwei Cui , Rogatien Tobossi , Olivia Vigouroux Subjects : Learning (cs.LG) ; Machine Learning (stat.ML)

In this paper, we apply neural networks to the digital marketing world for the

purpose of better targeting the potential customers. To do so, we model the

customer online behaviours using dedicated neural network architectures.

Starting from user searched keywords in a search engine to the landing page and

different following pages, until the user left the site, we model the whole

visited journey with a Recurrent Neural Network (RNN), together with

Convolutional Neural Networks (CNN) that can take into account the semantic

meaning of user searched keywords and different visited page names. With such a

model, we use Monte Carlo simulation to estimate the conversion rates of each

potential customer in future visits. We believe our concept and the

preliminary promising results in this paper enable the use of largely available

customer online behaviours data for advanced digital marketing analysis.

Revisiting Small Batch Training for Deep Neural Networks

Dominic Masters , Carlo Luschi Subjects : Learning (cs.LG) ; Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

Modern deep neural network training is typically based on mini-batch

stochastic gradient optimization. While the use of large mini-batches increases

the available computational parallelism, small batch training has been shown to

provide improved generalization performance and allows a significantly smaller

memory footprint, which might also be exploited to improve machine throughput.

In this paper, we review common assumptions on learning rate scaling and

training duration, as a basis for an experimental comparison of test

performance for different mini-batch sizes. We adopt a learning rate that

corresponds to a constant average weight update per gradient calculation (i.e.,

per unit cost of computation), and point out that this results in a variance of

the weight updates that increases linearly with the mini-batch size \(m\).

The collected experimental results for the CIFAR-10, CIFAR-100 and ImageNet

datasets show that increasing the mini-batch size progressively reduces the

range of learning rates that provide stable convergence and acceptable test

performance. On the other hand, small mini-batch sizes provide more up-to-date

gradient calculations, which yields more stable and reliable training. The best

performance has been consistently obtained for mini-batch sizes between \(m = 2\)

and \(m = 32\), which contrasts with recent work advocating the use of mini-batch

sizes in the thousands.
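
The linear growth of the update variance with the mini-batch size can be seen from a short derivation. It rests on a simplifying assumption made here for illustration: the per-example gradients \(g_i\) are independent with common mean \(\mu\) and covariance \(\Sigma\). Write \(\eta\) for the learning rate and \(\tilde{\eta} = \eta / m\) for the base learning rate that is held constant across batch sizes.

```latex
\begin{align*}
\Delta\theta &= -\frac{\eta}{m}\sum_{i=1}^{m} g_i = -\tilde{\eta}\sum_{i=1}^{m} g_i, \\
\mathbb{E}[\Delta\theta] &= -\tilde{\eta}\, m\, \mu
  \quad\Rightarrow\quad
  \frac{\mathbb{E}[\Delta\theta]}{m} = -\tilde{\eta}\,\mu
  \quad \text{(constant per gradient calculation)}, \\
\operatorname{Cov}[\Delta\theta] &= \tilde{\eta}^{2}\, m\, \Sigma
  \quad \text{(linear in } m\text{)}.
\end{align*}
```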

Robust and scalable learning of data manifolds with complex topologies via ElPiGraph

Luca Albergante , Evgeny M. Mirkes , Huidong Chen , Alexis Martin , Louis Faure , Emmanuel Barillot , Luca Pinello , Alexander N. Gorban , Andrei Zinovyev

Comments: 23 pages, 9 figures

Subjects

:

Learning (cs.LG)

; Quantitative Methods (q-bio.QM); Machine Learning (stat.ML)

We present ElPiGraph, a method for approximating data distributions having

non-trivial topological features such as the existence of excluded regions or

branching structures. Unlike many existing methods, ElPiGraph is not based on

the construction of a k-nearest neighbour graph, a procedure that can perform

poorly in the case of multidimensional and noisy data. Instead, ElPiGraph

constructs elastic principal graphs in a more robust way by minimizing elastic

energy, applying graph grammars and explicitly controlling topological

complexity. Using trimmed approximation error function makes ElPiGraph

extremely robust to the presence of background noise without decreasing

computational performance and allows it to deal with complex cases of manifold

learning (for example, ElPiGraph can learn disconnected intersecting

manifolds). Thanks to the quasi-quadratic nature of the elastic function,

ElPiGraph performs almost as fast as a simple k-means clustering and,

therefore, is much more scalable than alternative methods, and can work on

large datasets containing millions of high dimensional points on a personal

computer. The excellent performance of the method opens the possibility to

apply resampling and to approximate complex data structures via principal graph

ensembles which can be used to construct consensus principal graphs. ElPiGraph

is currently implemented in five programming languages and accompanied by a

graphical user interface, which makes it a versatile tool to deal with complex

data in various fields from molecular biology, where it can be used to infer

pseudo-time trajectories from single-cell RNASeq, to astronomy, where it can be

used to approximate complex structures in the distribution of galaxies.

Streaming Active Learning Strategies for Real-Life Credit Card Fraud Detection: Assessment and Visualization

Fabrizio Carcillo , Yann-Aël Le Borgne , Olivier Caelen , Gianluca Bontempi

Journal-ref: International Journal of Data Science and Analytics 2018

Subjects

:

Learning (cs.LG)

; Machine Learning (stat.ML)

Credit card fraud detection is a very challenging problem because of the

specific nature of transaction data and the labeling process. The transaction

data are peculiar because they are obtained in a streaming fashion and are

strongly imbalanced and prone to non-stationarity. The labeling is the outcome

of an active learning process, as every day human investigators contact only a

small number of cardholders (associated with the riskiest transactions) and

obtain the class (fraud or genuine) of the related transactions. An adequate

selection of the set of cardholders is therefore crucial for an efficient fraud

detection process. In this paper, we present a number of active learning

strategies and we investigate their fraud detection accuracies. We compare

different criteria (supervised, semi-supervised and unsupervised) to query

unlabeled transactions. Finally, we highlight the existence of an

exploitation/exploration trade-off for active learning in the context of fraud

detection, which has so far been overlooked in the literature.

An Ensemble Generation Method Based on Instance Hardness

Felipe N. Walmsley , George D. C. Cavalcanti , Dayvid V. R. Oliveira , Rafael M. O. Cruz , Robert Sabourin

Comments: Paper accepted for publication on IJCNN 2018

Subjects

:

Learning (cs.LG)

; Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

In Machine Learning, ensemble methods have been receiving a great deal of

attention. Techniques such as Bagging and Boosting have been successfully

applied to a variety of problems. Nevertheless, such techniques are still

susceptible to the effects of noise and outliers in the training data. We

propose a new method for the generation of pools of classifiers based on

Bagging, in which the probability of an instance being selected during the

resampling process is inversely proportional to its instance hardness, which

can be understood as the likelihood of an instance being misclassified,

regardless of the choice of classifier. The goal of the proposed method is to

remove noisy data without sacrificing the hard instances which are likely to be

found on class boundaries. We evaluate the performance of the method in

nineteen public data sets, and compare it to the performance of the Bagging and

Random Subspace algorithms. Our experiments show that in high noise scenarios

the accuracy of our method is significantly better than that of Bagging.
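
The resampling idea can be sketched as a weighted bootstrap in which each instance is drawn with probability inversely proportional to its hardness. This is a minimal illustration, not the authors' implementation. How hardness is estimated is not stated in the abstract, so the scores are taken as given, and the small `eps` term that avoids division by zero is an assumption, as is the function name.

```python
import random


def hardness_weighted_bootstrap(instances, hardness, sample_size=None, eps=1e-6, rng=None):
    """Draw a bootstrap sample with selection weights proportional to 1 / hardness.

    instances   -- list of training instances (any objects)
    hardness    -- list of hardness scores in [0, 1], one per instance
    sample_size -- number of draws; defaults to len(instances), as in Bagging
    eps         -- assumed smoothing constant so that hardness 0 stays finite
    rng         -- optional random.Random for reproducibility
    """
    if len(instances) != len(hardness):
        raise ValueError("instances and hardness must have the same length")
    if rng is None:
        rng = random.Random()
    k = len(instances) if sample_size is None else sample_size
    weights = [1.0 / (h + eps) for h in hardness]
    # random.choices samples with replacement using relative weights.
    return rng.choices(instances, weights=weights, k=k)


data = ["x1", "x2", "x3", "x4"]
scores = [0.05, 0.10, 0.50, 0.95]  # x4 is the most likely to be misclassified
bag = hardness_weighted_bootstrap(data, scores, rng=random.Random(42))
print(bag)
```

Repeating the draw once per base classifier would produce a pool in which very hard (possibly noisy) instances appear less often, while instances of moderate hardness are still represented.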

GritNet: Student Performance Prediction with Deep Learning

Byung-Hak Kim , Ethan Vizitei , Varun Ganapathi Subjects : Learning (cs.LG) ; Computers and Society (cs.CY); Machine Learning (stat.ML)

Student performance prediction – where a machine forecasts the future

performance of students as they interact with online coursework – is a

challenging problem. Reliable early-stage predictions of a student’s future

performance could be critical to facilitate timely educational interventions

during a course. However, very few prior studies have explored this problem

from a deep learning perspective. In this paper, we recast the student

performance prediction problem as a sequential event prediction problem and

propose a new deep learning based algorithm, termed GritNet, which builds upon

the bidirectional long short term memory (BLSTM). Our results, from real

Udacity students’ graduation predictions, show that the GritNet not only

consistently outperforms the standard logistic-regression based method, but

that improvements are substantially pronounced in the first few weeks when

accurate predictions are most challenging.

Sampling-free Uncertainty Estimation in Gated Recurrent Units with Exponential Families

Seong Jae Hwang , Ronak Mehta , Vikas Singh

Comments: First version. Submitted to ECCV 2018

Subjects

:

Learning (cs.LG)

; Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

There has recently been a concerted effort to derive mechanisms in vision and

machine learning systems to offer uncertainty estimates of the predictions they

make. Clearly, there are enormous benefits to a system that is not only

accurate but also has a sense for when it is not sure. Existing proposals

center around Bayesian interpretations of modern deep architectures — these

are effective but can often be computationally demanding. We show how classical

ideas in the literature on exponential families on probabilistic networks

provide an excellent starting point to derive uncertainty estimates in Gated

Recurrent Units (GRU). Our proposal directly quantifies uncertainty

deterministically, without the need for costly sampling-based estimation. We

demonstrate how our model can be used to quantitatively and qualitatively

measure uncertainty in unsupervised image sequence prediction. To our

knowledge, this is the first result describing sampling-free uncertainty

estimation for powerful sequential models such as GRUs.

Nonparametric Stochastic Compositional Gradient Descent for Q-Learning in Continuous Markov Decision Problems

Alec Koppel , Ekaterina Tolstaya , Ethan Stump , Alejandro Ribeiro Subjects : Learning (cs.LG) ; Systems and Control (cs.SY); Machine Learning (stat.ML)

We consider Markov Decision Problems defined over continuous state and action

spaces, where an autonomous agent seeks to learn a map from its states to

actions so as to maximize its long-term discounted accumulation of rewards. We

address this problem by considering Bellman’s optimality equation defined over

action-value functions, which we reformulate into a nested non-convex

stochastic optimization problem defined over a Reproducing Kernel Hilbert Space

(RKHS). We develop a functional generalization of stochastic quasi-gradient

method to solve it, which, owing to the structure of the RKHS, admits a

parameterization in terms of scalar weights and past state-action pairs which

grows proportionately with the algorithm iteration index. To ameliorate this

complexity explosion, we apply Kernel Orthogonal Matching Pursuit to the

sequence of kernel weights and dictionaries, which yields a controllable error

in the descent direction of the underlying optimization method. We prove that

the resulting algorithm, called KQ-Learning, converges with probability 1 to a

stationary point of this problem, yielding a fixed point of the Bellman

optimality operator under the hypothesis that it belongs to the RKHS. Under

constant learning rates, we further obtain convergence to a small Bellman error

that depends on the chosen learning rates. Numerical evaluation on the

Continuous Mountain Car and Inverted Pendulum tasks yields convergent

parsimonious learned action-value functions, policies that are competitive with

the state of the art, and exhibit reliable, reproducible learning behavior.
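
For reference, the Bellman optimality equation for action-value functions is commonly written as below. The notation (reward \(r\), discount factor \(\gamma \in (0, 1)\), next state \(s'\)) is the standard textbook one and may differ from the paper's.

```latex
\[
Q^{*}(s, a) = \mathbb{E}_{s'}\!\left[\, r(s, a, s') + \gamma \max_{a'} Q^{*}(s', a') \;\middle|\; s, a \right]
\]
```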

Improving Supervised Bilingual Mapping of Word Embeddings

Armand Joulin , Piotr Bojanowski , Tomas Mikolov , Edouard Grave Subjects : Computation and Language (cs.CL) ; Learning (cs.LG)

Continuous word representations, learned on different languages, can be

aligned with remarkable precision. Using a small bilingual lexicon as training

data, learning the linear transformation is often formulated as a regression

problem using the square loss. The obtained mapping is known to suffer from the

hubness problem, when used for retrieval tasks (e.g. for word translation). To

address this issue, we propose to use a retrieval criterion instead of the

square loss for learning the mapping. We evaluate our method on word

translation, showing that our loss function leads to state-of-the-art results,

with the biggest improvements observed for distant language pairs such as

English-Chinese.

ADef: an Iterative Algorithm to Construct Adversarial Deformations

Rima Alaifari , Giovanni S. Alberti , Tandri Gauksson

Comments: 10 pages, 5 figures

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

; Cryptography and Security (cs.CR); Learning (cs.LG); Machine Learning (stat.ML)

While deep neural networks have proven to be a powerful tool for many

recognition and classification tasks, their stability properties are still not

well understood. In the past, image classifiers have been shown to be

vulnerable to so-called adversarial attacks, which are created by additively

perturbing the correctly classified image.

In this paper, we propose the ADef algorithm to construct a different kind of

adversarial attack created by iteratively applying small deformations to the

image, found through a gradient descent step. We demonstrate our results on

MNIST with a convolutional neural network and on ImageNet with Inception-v3 and

ResNet-101.

Unsupervised learning of the brain connectivity dynamic using residual D-net

Youngjoo Seo, Manuel Morante, Yannis Kopsinis, Sergios Theodoridis

Comments: 10 pages, 5 figures and 3 tables, under review in MIDL 2018

Subjects: Machine Learning (stat.ML); Learning (cs.LG)

In this paper, we propose a novel unsupervised learning method to learn the

brain dynamics using a deep learning architecture named residual D-net. As it

is often the case in medical research, in contrast to typical deep learning

tasks, the size of the resting-state functional Magnetic Resonance Image

(rs-fMRI) datasets for training is limited. Thus, the available data should be

very efficiently used to learn the complex patterns underneath the brain

connectivity dynamics. To address this issue, we use residual connections to

alleviate the training complexity through recurrent multi-scale representation.

We conduct two classification tasks to differentiate early and late stage Mild

Cognitive Impairment (MCI) from Normal healthy Control (NC) subjects. The

experiments verify that our proposed residual D-net indeed learns the brain

connectivity dynamics, leading to significantly higher classification accuracy

compared to previously published techniques.

One-Shot Learning using Mixture of Variational Autoencoders: a Generalization Learning approach

Decebal Constantin Mocanu, Elena Mocanu

Journal-ref: 17th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2018)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

Deep learning, even if it is very successful nowadays, traditionally needs

very large amounts of labeled data to perform excellently on the classification

task. In an attempt to solve this problem, the one-shot learning paradigm,

which makes use of just one labeled sample per class and prior knowledge,

becomes increasingly important. In this paper, we propose a new one-shot

learning method, dubbed MoVAE (Mixture of Variational AutoEncoders), to perform

classification. Complementary to prior studies, MoVAE represents a shift of

paradigm in comparison with the usual one-shot learning methods, as it does not

use any prior knowledge. Instead, it starts from zero knowledge and one labeled

sample per class. Afterward, by using unlabeled data and the generalization

learning concept (in a way, more as humans do), it is capable of gradually

improving its performance by itself. Moreover, if no unlabeled data are

available, MoVAE can still perform well in one-shot learning classification. We

demonstrate empirically the efficiency of our proposed approach on three

datasets, i.e. the handwritten digits (MNIST), fashion products

(Fashion-MNIST), and handwritten characters (Omniglot), showing that MoVAE

outperforms state-of-the-art one-shot learning algorithms.

A Simple Quantum Neural Net with a Periodic Activation Function

Ammar Daskin

Comments: conference paper

Subjects: Quantum Physics (quant-ph); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

In this paper, we propose a simple neural net that requires only \(O(n\log_2 k)\)

quantum gates and qubits. Here, \(n\) is the number of input

parameters, and \(k\) is the number of weights applied to these input parameters

in the proposed neural net. We describe the network in terms of a quantum

circuit, and then draw its equivalent classical neural net which involves

\(O(k^n)\) nodes in the hidden layer. Then, we show that the network uses a

periodic activation function of cosine values of the linear combinations of the

inputs and weights. The steps of the gradient descent are described, and then

Iris and Breast cancer datasets are used for the numerical simulations. The

numerical results indicate the network can be used in machine learning problems

and it may provide exponential speedup over the same structured classical

neural net.
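
As a rough classical illustration of a periodic activation, the sketch below implements a single unit whose output is the cosine of a linear combination of inputs and weights, together with one gradient-descent step on a squared loss. It is not the quantum circuit or the authors' network; the function names, the squared loss and the learning rate are assumptions made for the example.

```python
import math

def cos_unit(weights, inputs):
    """Output cos(w . x): a periodic activation of the linear combination."""
    z = sum(w * x for w, x in zip(weights, inputs))
    return math.cos(z)

def squared_loss(weights, inputs, target):
    """Squared error between the unit's output and the target."""
    return (cos_unit(weights, inputs) - target) ** 2

def gradient_step(weights, inputs, target, lr=0.1):
    """One gradient-descent step on the squared loss."""
    z = sum(w * x for w, x in zip(weights, inputs))
    err = math.cos(z) - target
    # d/dw_i (cos(z) - t)^2 = 2 (cos(z) - t) * (-sin(z)) * x_i
    grads = [2.0 * err * (-math.sin(z)) * x for x in inputs]
    return [w - lr * g for w, g in zip(weights, grads)]

inputs = [1.0]
target = math.cos(1.0)          # the loss is zero at w = 1.0
w_before = [0.5]
w_after = gradient_step(w_before, inputs, target)
loss_before = squared_loss(w_before, inputs, target)
loss_after = squared_loss(w_after, inputs, target)
```

Because the cosine is periodic, the loss has many equivalent minima (here any weight congruent to plus or minus 1 modulo 2π), so the step moves toward the nearest one rather than a unique solution.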

MobileFaceNets: Efficient CNNs for Accurate Real-time Face Verification on Mobile Devices

Sheng Chen, Yang Liu, Xiang Gao, Zhen Han

Comments: To be submitted to SPL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG)

In this paper, we present a class of extremely efficient CNN models called

MobileFaceNets, which use no more than 1 million parameters and are specifically

tailored for high-accuracy real-time face verification on mobile and embedded

devices. We also make a simple analysis on the weakness of common mobile

networks for face verification. The weakness has been well overcome by our

specifically designed MobileFaceNets. Under the same experimental conditions,

our MobileFaceNets achieve significantly superior accuracy as well as more than

2 times actual speedup over MobileNetV2. After being trained with ArcFace loss on the

refined MS-Celeb-1M from scratch, our single MobileFaceNet model of 4.0MB size

achieves 99.55% face verification accuracy on LFW and 92.59% TAR (FAR1e-6) on

MegaFace Challenge 1, which is even comparable to state-of-the-art big CNN

models of hundreds of MB in size. The fastest one of our MobileFaceNets has an actual

inference time of 18 milliseconds on a mobile phone. Our experiments on LFW,

AgeDB, and MegaFace show that our MobileFaceNets achieve significantly improved

efficiency compared with the state-of-the-art lightweight and mobile CNNs for

face verification.

Two Use Cases of Machine Learning for SDN-Enabled IP/Optical Networks: Traffic Matrix Prediction and Optical Path Performance Prediction

Gagan Choudhury, David Lynch, Gaurav Thakur, Simon Tse

Subjects: Networking and Internet Architecture (cs.NI); Learning (cs.LG); Machine Learning (stat.ML)

We describe two applications of machine learning in the context of IP/Optical

networks. The first one allows agile management of resources at a core

IP/Optical network by using machine learning for short-term and long-term

prediction of traffic flows and joint global optimization of IP and optical

layers using colorless/directionless (CD) flexible ROADMs. Multilayer

coordination allows for significant cost savings, flexible new services to meet

dynamic capacity needs, and improved robustness by being able to proactively

adapt to new traffic patterns and network conditions. The second application is

important as we migrate our metro networks to Open ROADM networks, to allow

physical routing without the need for detailed knowledge of optical parameters.

We discuss a proof-of-concept study, where detailed performance data for

wavelengths on a current flexible ROADM network is used for machine learning to

predict the optical performance of each wavelength. Both applications can be

efficiently implemented by using a SDN (Software Defined Network) controller.

Unsupervised Representation Adversarial Learning Network: from Reconstruction to Generation

Yuqian Zhou, Kuangxiao Gu, Thomas Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Learning (cs.LG); Machine Learning (stat.ML)

A good representation for arbitrarily complicated data should have the

capability of semantic generation, clustering and reconstruction. Previous

research has already achieved impressive performance on either one. This paper

aims at learning a disentangled representation effective for all of them in an

unsupervised way. To achieve all the three tasks together, we learn the forward

and inverse mapping between data and representation on the basis of a symmetric

adversarial process. In theory, we minimize the upper bound of the two

conditional entropy losses between the latent variables and the observations

together to achieve the cycle consistency. The newly proposed RepGAN is tested

on MNIST, fashionMNIST, CelebA, and SVHN datasets to perform unsupervised or

semi-supervised classification, generation and reconstruction tasks. The result

demonstrates that RepGAN is able to learn a useful and competitive

representation. To the authors’ knowledge, our work is the first one to achieve

both a high unsupervised classification accuracy and low reconstruction error

on MNIST.

Randomized ICA and LDA Dimensionality Reduction Methods for Hyperspectral Image Classification

Chippy Jayaprakash, Bharath Bhushan Damodaran, Sowmya V, K P Soman

Comments: Submitted IEEE JSTARS

Subjects: Machine Learning (stat.ML); Learning (cs.LG)

Dimensionality reduction is an important step in processing the hyperspectral

images (HSI) to overcome the curse of dimensionality problem. Linear

dimensionality reduction methods such as Independent component analysis (ICA)

and Linear discriminant analysis (LDA) are commonly employed to reduce the

dimensionality of HSI. These methods fail to capture non-linear dependency in

the HSI data, as the data lie on a nonlinear manifold. To handle this, nonlinear

transformation techniques based on kernel methods were introduced for

dimensionality reduction of HSI. However, the kernel methods involve cubic

computational complexity while computing the kernel matrix, and thus their

potential cannot be explored when the number of pixels (samples) is large. In

the literature, a smaller number of pixels is randomly selected to partially

overcome this issue; however, this sub-optimal strategy might neglect important

information in the HSI. In this paper, we propose randomized solutions to the

ICA and LDA dimensionality reduction methods using Random Fourier features, and

we label them as RFFICA and RFFLDA. Our proposed method overcomes the

scalability issue and handles the non-linearities present in the data more

efficiently. Experiments conducted with two real-world hyperspectral datasets

demonstrate that our proposed randomized methods outperform the conventional

kernel ICA and kernel LDA in terms of overall and per-class accuracies and

computational time.
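
For readers unfamiliar with Random Fourier features, the following sketch shows the basic idea on which such randomized methods rest: an explicit random feature map whose inner products approximate a Gaussian (RBF) kernel, so that a linear method can be run on the features without forming the kernel matrix. The kernel choice, bandwidth, feature count and function names are assumptions for the example; this is not the authors' RFFICA or RFFLDA code.

```python
import math
import random

def rbf_kernel(x, y, sigma=1.0):
    """Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def make_rff(dim, num_features, sigma=1.0, seed=0):
    """Return a feature map z with z(x) . z(y) approximately equal to k(x, y)."""
    rng = random.Random(seed)
    # Frequencies are drawn from the kernel's Fourier transform: N(0, 1/sigma^2).
    freqs = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
             for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def features(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(freqs, phases)]

    return features

x = [0.2, -0.1, 0.4]
y = [0.0, 0.3, 0.1]
features = make_rff(dim=3, num_features=2000)
approx = sum(a * b for a, b in zip(features(x), features(y)))
exact = rbf_kernel(x, y)
```

The approximation error shrinks roughly as one over the square root of the number of features, which is the trade-off between accuracy and cost that makes the approach attractive for large pixel counts.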

Effects of sampling skewness of the importance-weighted risk estimator on model selection

Wouter M. Kouw, Marco Loog

Comments: Conference paper, 6 pages, 5 figures

Subjects: Machine Learning (stat.ML); Learning (cs.LG)

Importance-weighting is a popular and well-researched technique for dealing

with sample selection bias and covariate shift. It has desirable

characteristics such as unbiasedness, consistency and low computational

complexity. However, weighting can have a detrimental effect on an estimator as

well. In this work, we empirically show that the sampling distribution of an

importance-weighted estimator can be skewed. For sample selection bias

settings, and for small sample sizes, the importance-weighted risk estimator

produces overestimates for datasets in the body of the sampling distribution,

i.e. the majority of cases, and large underestimates for data sets in the tail

of the sampling distribution. These over- and underestimates of the risk lead

to suboptimal regularization parameters when used for importance-weighted

validation.
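
As a minimal sketch of the quantities discussed, the code below computes an importance-weighted risk estimate, where each loss is weighted by the ratio of target to source density, and a moment-based sample skewness that could be applied to repeated estimates. The function names and toy numbers are hypothetical, and the skewness formula is the standard moment estimator rather than anything specified in the abstract.

```python
def importance_weighted_risk(losses, weights):
    """Mean of w(x_i) * loss_i, with w(x) = p_target(x) / p_source(x)."""
    if len(losses) != len(weights):
        raise ValueError("losses and weights must have the same length")
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)

def sample_skewness(values):
    """Moment-based skewness m3 / m2^(3/2) of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Toy numbers: three per-sample losses and their importance weights.
risk = importance_weighted_risk([1.0, 2.0, 3.0], [0.5, 1.0, 1.5])

# A right-skewed set of risk estimates has positive skewness.
skew = sample_skewness([0.0, 0.0, 0.0, 10.0])
```

In an experiment of the kind the abstract describes, one would draw many small datasets, compute the weighted risk on each, and inspect the skewness of the resulting estimates.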

Generating Music using an LSTM Network

Nikhil Kotecha, Paul Young

Comments: 8 pages, 11 figures

Subjects: Sound (cs.SD); Learning (cs.LG); Audio and Speech Processing (eess.AS)

A model of music needs to have the ability to recall past details and have a

clear, coherent understanding of musical structure. Detailed in the paper is a

neural network architecture that predicts and generates polyphonic music

aligned with musical rules. The probabilistic model presented is a Bi-axial

LSTM trained with a kernel reminiscent of a convolutional kernel. When analyzed

quantitatively and qualitatively, this approach performs well in composing

polyphonic music. Link to the code is provided.

Information Theory

Mobile Edge Computing-Enabled Heterogeneous Networks

Chanwon Park, Jemin Lee

Comments: 12 pages, 11 figures, submitted to IEEE Transactions on Wireless Communications

Subjects: Information Theory (cs.IT)

Mobile edge computing (MEC) has been introduced for providing computing

capabilities at the edge of networks to improve the latency performance of

wireless networks. In this paper, we provide a novel framework for

MEC-enabled heterogeneous networks (HetNets), composed of multi-tier

networks with access points (APs) (i.e., MEC servers), which have different

transmission power and different computing capabilities. In this framework, we

also consider multiple-type mobile users with different sizes of computation

tasks, and they offload the tasks to a MEC server, and receive the computation

resulting data from the server. We derive the successful edge computing

probability considering both the computation and communication performance

using the queueing theory and stochastic geometry. We then analyze the effects

of network parameters and bias factors in MEC server association on the

successful edge computing probability. We show how the optimal bias factors

in terms of successful edge computing probability can be changed according to

the user type and MEC tier, and how they are different to the conventional ones

that did not consider the computing capabilities and task sizes. It is also

shown how the optimal bias factors can be changed when minimizing the mean

latency instead of successful edge computing probability. This study provides

the design insights for the optimal configuration of MEC-enabled HetNets.

Achievable Information Rates for Nonlinear Fiber Communication via End-to-end Autoencoder Learning

Shen Li, Christian Häger, Nil Garcia, Henk Wymeersch

Comments: 3 pages, 4 figures, submitted to ECOC 2018

Subjects: Information Theory (cs.IT); Machine Learning (stat.ML)

Machine learning is used to compute achievable information rates (AIRs) for a

simplified fiber channel. The approach jointly optimizes the input distribution

(constellation shaping) and the auxiliary channel distribution to compute AIRs

without explicit channel knowledge in an end-to-end fashion.

On the Effects of Subpacketization in Content-Centric Mobile Networks

Adeel Malik, Sung Hoon Lim, Won-Yong Shin

Comments: 16 pages, 6 figures, To appear in the IEEE Journal on Selected Areas in Communications

Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

A large-scale content-centric mobile ad hoc network employing

subpacketization is studied in which each mobile node having finite-size cache

moves according to the reshuffling mobility model and requests a content object

from the library independently at random according to the Zipf popularity

distribution. Instead of assuming that one content object is transferred in a

single time slot, we consider a more challenging scenario where the size of

each content object is considerably large and thus only a subpacket of a file

can be delivered during one time slot, which is motivated by a fast mobility

scenario. Under our mobility model, we consider a single-hop-based content

delivery and characterize the fundamental trade-offs between throughput and

delay. The order-optimal throughput-delay trade-off is analyzed by presenting

the following two content reception strategies: the sequential reception for

uncoded caching and the random reception for maximum distance separable

(MDS)-coded caching. We also perform numerical evaluation to validate our

analytical results. In particular, we conduct performance comparisons between

the uncoded caching and the MDS-coded caching strategies by identifying the

regimes in which the performance difference between the two caching strategies

becomes prominent with respect to system parameters such as the Zipf exponent

and the number of subpackets. In addition, we extend our study to the random

walk mobility scenario and show that our main results are essentially the same

as those in the reshuffling mobility model.
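
The Zipf popularity model mentioned above can be sketched as follows: the content with popularity rank i is requested with probability proportional to i raised to the negative Zipf exponent. The library size, exponent and function names below are illustrative assumptions only.

```python
import random

def zipf_pmf(library_size, exponent):
    """Request probabilities p_i proportional to i^(-exponent), i = 1..library_size."""
    weights = [rank ** (-exponent) for rank in range(1, library_size + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def sample_requests(pmf, num_requests, seed=0):
    """Draw independent content requests (0-based content indices)."""
    rng = random.Random(seed)
    return rng.choices(range(len(pmf)), weights=pmf, k=num_requests)

pmf = zipf_pmf(library_size=100, exponent=0.8)
requests = sample_requests(pmf, num_requests=10)
```

A larger exponent concentrates requests on a few popular objects, which is why the Zipf exponent appears as a key system parameter in caching analyses.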

DFT-Based Hybrid Beamforming Multiuser Systems: Rate Analysis and Beam Selection

Yu Han, Shi Jin, Jun Zhang, Jiayi Zhang, Kai-Kit Wong

Subjects: Information Theory (cs.IT)

This paper considers the discrete Fourier transform (DFT) based hybrid

beamforming multiuser system and studies the use of analog beam selection

schemes. We first analyze the uplink ergodic achievable rates of the

zero-forcing (ZF) receiver and the maximum-ratio combining (MRC) receiver under

Ricean fading conditions. We then examine the downlink ergodic achievable rates

for the ZF and maximum-ratio transmitting (MRT) precoders. The long-term and

short-term normalization methods are introduced, which utilize long-term and

instantaneous channel state information (CSI) to implement the downlink power

normalization, respectively. Also, approximations and asymptotic expressions of

both the uplink and downlink rates are obtained, which facilitate the analog

beam selection solutions to maximize the achievable rates. An exhaustive search

provides the optimal results but to reduce the time-consumption, we resort to

the derived rate limits and propose the second selection scheme based on the

projected power of the line-of-sight (LoS) paths. We then combine the

advantages of the two schemes and propose a two-step scheme that achieves near

optimal performances with much less time-consumption than exhaustive search.

Numerical results confirm the analytical results of the ergodic achievable rate

and reveal the effectiveness of the proposed two-step method.
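
The idea of selecting an analog beam by the projected power of a line-of-sight path can be illustrated with a small sketch: build the DFT beams for an array, project a steering vector onto each beam, and pick the beam with the largest power. The uniform linear array, the spatial-frequency parameterisation and all function names are assumptions for the example, and the code is not the two-step scheme proposed in the paper.

```python
import cmath
import math

def dft_beam(index, num_antennas):
    """Unit-norm DFT beam (one column of the normalised DFT matrix)."""
    return [cmath.exp(2j * math.pi * index * n / num_antennas) / math.sqrt(num_antennas)
            for n in range(num_antennas)]

def steering_vector(spatial_freq, num_antennas):
    """LoS array response of a uniform linear array at a given spatial frequency."""
    return [cmath.exp(2j * math.pi * spatial_freq * n) for n in range(num_antennas)]

def projected_power(beam, channel):
    """Power of the channel vector projected onto a beam: |beam^H channel|^2."""
    inner = sum(b.conjugate() * h for b, h in zip(beam, channel))
    return abs(inner) ** 2

def select_beam(channel, num_antennas):
    """Return (best beam index, its projected power) over all DFT beams."""
    powers = [projected_power(dft_beam(k, num_antennas), channel)
              for k in range(num_antennas)]
    best = max(range(num_antennas), key=powers.__getitem__)
    return best, powers[best]

num_antennas = 8
los_path = steering_vector(3 / 8, num_antennas)   # aligned with DFT beam 3
best_index, best_power = select_beam(los_path, num_antennas)
```

When the path direction falls exactly on a DFT grid point, one beam captures the full array gain; off-grid directions spread power over neighbouring beams, which is where a careful selection rule matters.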

Dynamic Power Splitting for SWIPT with Nonlinear Energy Harvesting in Ergodic Fading Channel

Jae-Mo Kang, Chang-Jae Chun, Il-Min Kim, Dong In Kim

Comments: 15 pages, 4 figures

Subjects: Information Theory (cs.IT)

We study the dynamic power splitting for simultaneous wireless information

and power transfer (SWIPT) in the ergodic fading channel. Considering the

nonlinearity of practical energy harvesting circuits, we adopt the realistic

nonlinear energy harvesting (EH) model rather than the idealistic linear EH

model. To characterize the optimal rate-energy (R-E) tradeoff, we consider the

problem of maximizing the R-E region, which is nonconvex. We solve this

challenging problem for two different cases of the channel state information

(CSI): (i) when the CSI is known only at the receiver (CSIR case) and (ii) when

the CSI is known at both the transmitter and the receiver (CSIT case). First,

for the case of CSIR, we develop the optimal dynamic power splitting scheme. To

address the complexity issue of the optimal scheme, we also propose a

suboptimal scheme with low complexity. Comparing the proposed schemes to the

existing schemes, we provide various useful and interesting insights into the

dynamic power splitting for the nonlinear EH. Second, we present the optimal

and suboptimal schemes for the case of CSIT, and we obtain further insights.

Numerical results demonstrate that the proposed schemes significantly

outperform the existing schemes and the proposed suboptimal scheme works very

close to the optimal scheme at a much lower complexity.

QoS Provisioning in Large Wireless Networks

Marios Kountouris, Nikolaos Pappas, Apostolos Avranas

Comments: 6 pages; conference publication

Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

Quality of service (QoS) provisioning in next-generation mobile

communications systems entails a deep understanding of the delay performance.

The delay in wireless networks is strongly affected by the traffic arrival

process and the service process, which in turn depends on the medium access

protocol and the signal-to-interference-plus-noise ratio (SINR) distribution.

In this work, we characterize the conditional distribution of the service

process given the point process in Poisson bipolar networks. We then provide an

upper bound on the delay violation probability combining tools from stochastic

network calculus and stochastic geometry. Furthermore, we analyze the delay

performance under statistical queueing constraints using the effective capacity

formulation. The impact of QoS requirements, network geometry and link distance

on the delay performance is identified. Our results provide useful insights for

guaranteeing stringent delay requirements in large wireless networks.
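
For orientation, the effective capacity referred to above is commonly defined, for a service process that is independent across time slots, as EC(θ) = -(1/θ) ln E[exp(-θ S)], where S is the per-slot service and θ > 0 is the QoS exponent. The sketch below estimates it from samples; the i.i.d. assumption, the sample values and the function name are assumptions for the example, and the abstract itself gives no formula.

```python
import math

def effective_capacity(service_samples, theta):
    """Sample estimate of EC(theta) = -(1/theta) * ln E[exp(-theta * S)]."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    mgf = sum(math.exp(-theta * s) for s in service_samples) / len(service_samples)
    return -math.log(mgf) / theta

# A constant service rate is supported in full, whatever the QoS exponent.
ec_constant = effective_capacity([2.0, 2.0, 2.0], theta=0.5)

# A variable service with the same mean (2.0) supports a lower rate.
ec_variable = effective_capacity([1.0, 3.0], theta=1.0)
```

A larger θ corresponds to a stricter delay constraint and pulls the effective capacity from the mean service rate toward the worst-case rate.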

Connectivity of Ad Hoc Wireless Networks with Node Faults

Satoshi Takabe, Tadashi Wadayama

Comments: 6 pages, 3 figures

Subjects: Information Theory (cs.IT); Social and Information Networks (cs.SI)

Connectivity of wireless sensor networks (WSNs) is a fundamental global

property expected to be maintained even though some sensor nodes are at fault.

In this paper, we investigate the connectivity of random geometric graphs

(RGGs) in the node fault model as an abstract model of ad hoc WSNs with

unreliable nodes. In the model, each node is assumed to be stochastically at

fault, i.e., removed from a graph. As a measure of reliability, the network

breakdown probability is then defined as the average probability that a

resulting survival graph is disconnected over RGGs. We examine RGGs with

general connection functions as an extension of a conventional RGG model and

provide two mathematical analyses: the asymptotic analysis for infinite RGGs

that reveals the phase transition thresholds of connectivity, and the

non-asymptotic analysis for finite RGGs that provides a useful approximation

formula. Those analyses are supported by numerical simulations in the Rayleigh

SISO model reflecting a practical wireless channel.
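
The network breakdown probability described above can be estimated by simulation, as in the following sketch: place nodes uniformly at random, remove each node independently with a given fault probability, and check whether the surviving graph is connected. It assumes a unit square and a hard connection radius (the paper considers general connection functions), and treats a graph with at most one surviving node as connected; all names and parameter values are hypothetical.

```python
import math
import random

def is_connected(points, radius):
    """Depth-first search over the geometric graph with a hard connection radius."""
    n = len(points)
    if n <= 1:
        return True  # convention assumed here: 0 or 1 node counts as connected
    seen = {0}
    stack = [0]
    while stack:
        i = stack.pop()
        for j in range(n):
            if j in seen:
                continue
            dist = math.hypot(points[i][0] - points[j][0],
                              points[i][1] - points[j][1])
            if dist <= radius:
                seen.add(j)
                stack.append(j)
    return len(seen) == n

def breakdown_probability(num_nodes, radius, fault_prob, trials=2000, seed=0):
    """Monte Carlo estimate of P(survival graph is disconnected), averaged over RGGs."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        points = [(rng.random(), rng.random()) for _ in range(num_nodes)]
        # Each node is independently at fault (removed) with probability fault_prob.
        survivors = [p for p in points if rng.random() >= fault_prob]
        if not is_connected(survivors, radius):
            failures += 1
    return failures / trials

estimate = breakdown_probability(num_nodes=30, radius=0.3, fault_prob=0.1, trials=500)
```

Sweeping the radius or the fault probability in such a simulation is one way to observe the sharp change in connectivity that the asymptotic analysis characterises as a phase transition.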

Bias-variance tradeoff in MIMO channel estimation

Luc Le Magoarou (IRT b-com), Stéphane Paquelet (IRT b-com)

Subjects: Signal Processing (eess.SP); Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

Channel estimation is challenging in multi-antenna communication systems,

because of the large number of parameters to estimate. It is possible to

facilitate this task by using a physical model describing the multiple paths

constituting the channel, in the hope of reducing the number of unknowns in the

problem. Adjusting the number of estimated paths leads to a bias-variance

tradeoff. This paper explores this tradeoff, aiming to find the optimal number

of paths to estimate. Moreover, the approach based on a physical model is

compared to the classical least squares and Bayesian techniques. Finally, the

impact of channel estimation error on the system data rate is assessed.

MIMO Channel Hardening: A Physical Model based Analysis

Matthieu Roy (IRT b-com), Stéphane Paquelet (IRT b-com), Luc Le Magoarou (IRT b-com), Matthieu Crussière (IETR, IRT b-com)

Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT)

In a multiple-input-multiple-output (MIMO) communication system, the

multipath fading is averaged over radio links. This well-known channel

hardening phenomenon plays a central role in the design of massive MIMO

systems. The aim of this paper is to study channel hardening using a physical

channel model in which the influences of propagation rays and antenna array

topologies are highlighted. A measure of channel hardening is derived through

the coefficient of variation of the channel gain. Our analyses and closed form

results based on the used physical model are consistent with those of the

literature relying on more abstract Rayleigh fading models, but offer further

insights on the relationship with channel characteristics.
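
As a point of reference for the abstract Rayleigh models mentioned above, the sketch below estimates the squared coefficient of variation of the channel gain for i.i.d. Rayleigh fading, where each per-antenna power gain is a unit-mean exponential variable. In that case the squared coefficient of variation equals one over the number of antennas, so the gain "hardens" as the array grows. The i.i.d. model, trial count and function name are assumptions; the paper's physical ray-based model is not reproduced here.

```python
import random

def channel_gain_cv_squared(num_antennas, trials=5000, seed=0):
    """Monte Carlo estimate of Var(||h||^2) / E[||h||^2]^2 for i.i.d. Rayleigh fading."""
    rng = random.Random(seed)
    # For h_i ~ CN(0, 1), the power |h_i|^2 is exponentially distributed with mean 1.
    gains = [sum(rng.expovariate(1.0) for _ in range(num_antennas))
             for _ in range(trials)]
    mean = sum(gains) / trials
    var = sum((g - mean) ** 2 for g in gains) / trials
    return var / mean ** 2

cv2_small_array = channel_gain_cv_squared(4)
cv2_large_array = channel_gain_cv_squared(64)
```

Comparing the two estimates shows the relative fluctuation of the channel gain dropping as antennas are added, which is the hardening effect in its simplest form.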


Source: https://www.52ml.net/22372.html