# Comfirmed Speakers

# Speakers and Speeches Information

University of California, Berkeley

**Title: **Lossless compression of graphical data

**Abstract: **
(joint work with Payam Delgosha)
Many data sources arising from social networks, communication networks, and
networks describing biological processes are indexed by the underlying graphs,
rather than as time series. These graphs are often sparse, i.e. the number of edges is
on the same scale as the number of vertices.
The local weak limit theory for sparse graphs allows us to think of such graphical data
as drawn from an analog of a stochastic process.
Just as a stochastic process is a stochastic model for a time series got by
picking a time index at random and viewing how the time series looks from that time
index, in the local weak limit theory one studies the graphical data by picking a node
of the graph at random and seeing how the data looks from the point of view of that node.
We illustrate the theory and this viewpoint by discussing
two problems of lossless data compression of graphical data.
Consider a sparse graph whose vertices and edges carry marks, viewed as data samples.
We first pose the problem of representing the graphical data
in a universal sense, i.e., without making any
statistical assumptions about the data being compressed. It turns out that one can
do this and, what is more, the compression can be done
in the most efficient way possible. Namely, one can make sense of a notion of entropy
(due to Bordenave and Caputo) associated to the local weak limit,
which is on a linear scale in the number of nodes, and one can compress the graphical data
down to this entropy in a universal sense.
We next prove an analog
of the Slepian-Wolf theorem of distributed compression for graphical data. Here
we asssume that multiple agents see partial views of a sample from a graphical data source
modeled as either an Erdos-Renyi ensemble or a configuration model ensemble. The agents have
to compress their individual observations so that a central receiver, getting the individual
compressed representations, can perfectly recover the data source with probability
close to 1. We characterize the achievable rate region in terms of the Bordenave-Caputo
entropy.

**Bio: **
Venkat Anantharam is on the faculty of the EECS Department at
the University of California, Berkeley.
He received the Philips India Medal and the President of India Gold Medal from
IIT Madras in 1980 and an NSF Presidential Young Investigator award in 1988.
He is a corecipient of the 1998 Prize Paper Award of the
IEEE Information Theory Society, and a corecipient of the 2000
Stephen O. Rice Prize Paper Award of the IEEE Communications Theory Society.
He received the Distinguished Alumnus Award from IIT Madras in 2008.
He is a Fellow of the IEEE.

Princeton University

**Title: **Random initialization in solving quadratic systems of equations

**Abstract: **
Recent years have seen a flurry of activities in designing provably efficient nonconvex procedures for solving statistical estimation problems. Due to the highly nonconvex nature of the empirical loss, state-of-the-art procedures often require suitable initialization and proper regularization (e.g. trimming, regularized cost, projection) in order to guarantee fast convergence. For vanilla procedures such as gradient descent, however, prior theory is often either far from optimal or completely lacks theoretical guarantees.
This talk is concerned with a striking phenomenon arising in solving random quadratic systems of equations (a.k.a. phase retrieval): even in the absence of careful initialization and/or explicit regularization, gradient descent converges to the optimal solution within a logarithmic number of iterations, thus achieving near-optimal statistical and computational guarantees at once. All of these are achieved by exploiting the statistical models in analyzing optimization algorithms, via a leave-one-out approach that enables the decoupling of certain statistical dependency between the gradient descent iterates and the data.
This is joint work with Cong Ma, Kaizheng Wang, Yuejie Chi, and Jianqing Fan.

**Bio: **
Yuxin Chen is currently an assistant professor in the Department of Electrical Engineering at Princeton University. Prior to joining Princeton, he was a postdoctoral scholar in the Department of Statistics at Stanford University, and he completed his Ph.D. in Electrical Engineering at Stanford University. His research interests include high-dimensional data analysis, convex and nonconvex optimization, statistical learning, and information theory.

Carnegie Mellon University

**Title: **Blind Deconvolution with Geometric Priors

**Abstract: **
Blind deconvolution aims to recover both the input signal and the convolution kernel from their convolutional outputs. A classical inverse problem, it finds applications in various areas of signal processing, wireless communications, and image processing. Often, it is necessary to impose additional geometric priors on the unknowns in order to alleviate the ill-posedness of the problem. This talk will present a few case studies on solving blind deconvolution problems under subspace, sparsity and shift-invariance constraints using both convex and nonconvex optimization, with provable performance guarantees.

**Bio: **
Dr. Yuejie Chi received the Ph.D. degree in Electrical Engineering from Princeton University in 2012, and the B.E. (Hon.) degree in Electrical Engineering from Tsinghua University, Beijing, China, in 2007. Since January 2018, she is Robert E. Doherty Career Development Professor and Associate Professor with the department of Electrical and Computer Engineering at Carnegie Mellon University, after spending 5 years at The Ohio State University. She is a recipient of IEEE Signal Processing Society Young Author Best Paper Award, NSF CAREER Award, AFOSR YIP Award, and ONR YIP Award. She is an Elected Member of the MLSP and SPTM Technical Committees of the IEEE Signal Processing Society since January 2016. Her research interests are in the mathematics of data representation that take advantage of structures and geometry to minimize complexity and improve performance. Specific topics include mathematical and statistical signal processing, machine learning, large-scale optimization, sampling and information theory, with applications in sensing, imaging and data science.

King Juan Carlos University

**Title: **Learning graphs from nodal observations via graph filter identification

**Abstract: **
The problem addressed in this talk is that of learning an unknown network from nodal observations, which are modeled as graph signals generated by linear diffusion dynamics that depend on the topology of the sought graph. We assume that the observed diffused signals can be modeled as the output of a linear graph filter, applied to a set of independent input graph signals with arbitrarily-correlated components. In this context, we first rely on observations of the output signals along with prior statistical information on the inputs to identify the diffusion filter. Critical to this end is the formulation of the problem as the solution of a system of quadratic equations. Then, we leverage that linear graph filters are polynomials of the so-called graph-shift operator (a matrix representation of the network topology) to recover the graph via a convex optimization problem. We address three cases of increasing complexity: undirected networks with white inputs, undirected networks with colored inputs, and directed networks with arbitrary inputs. The resultant problems can be recast as sparse recovery problem with either Boolean constraints (for the undirected case) or manifold constraints (for the directed case). Numerical tests corroborating the effectiveness of the proposed algorithms in recovering synthetic and real-world directed graphs are provided.

**Bio: **
Antonio G. Marques received the telecommunications engineering degree and the Doctorate degree, both with highest honors, from the Carlos III University of Madrid, Madrid, Spain, in 2002 and 2007, respectively. In 2007, he became a faculty in the Department of Signal Theory and Communications, King Juan Carlos University, Madrid, Spain, where he currently develops his research and teaching activities as an Associate Professor. From 2005 to 2015, he held different visiting positions at the University of Minnesota, Minneapolis, MN, USA. In 2015 and 2016, he was a Visitor Scholar in the University of Pennsylvania, Philadelphia, PA, USA. His research interests lie in the areas of signal processing, networking and communications. His current research focuses on stochastic optimization of wireless and power networks, signal processing for graphs, and nonlinear network optimization. He has served the IEEE in a number of posts, collaborating on the organization of more than 20 IEEE conferences and workshops. Currently, he is an Associate Editor of the SIGNAL PROCESSING LETTERS, a member of the IEEE Signal Processing Theory and Methods Technical Committee and a member of the IEEE Signal Processing for Big Data Special Interest Group. Dr. Marques’ work has been awarded in several conferences and workshops, with recent best paper awards including Asilomar 2015, IEEE SSP 2016 and IEEE SAM 2016.

University of Minnesota

**Title: **Topology Identification and Learning over Graphs: Accounting for Nonlinearities and Dynamics

**Abstract: **
Learning the topology of graphs as well as processes evolving over graphs are tasks emerging in application domains as diverse as gene-regulatory, brain, power, and social networks, to name a few. Scalable approaches to deal with such high-dimensional settings aim to address the unique modeling and computational challenges associated with data-driven science in the modern era of big data analytics. Albeit simple and tractable, linear time-invariant models are limited as they are incapable of modeling changing topologies, as well as nonlinear and dynamic dependencies between nodal processes. To this end, novel approaches are presented to leverage nonlinear counterparts of partial correlation and partial Granger causality, as well as nonlinear structural equations and vector auto-regressions, along with attributes such as low rank, sparsity, and smoothness to capture even directional dependencies with abrupt change points, as well as dynamic processes over possibly time-evolving topologies. The unifying framework inherits the versatility and generality of kernel-based methods, and lends itself to batch and computationally affordable online learning algorithms, which include novel Kalman filters and smoothers over graphs. Real data experiments highlight the impact of the nonlinear and dynamic models on gene-regulatory and functional connectivity of brain networks, where connectivity patterns revealed exhibit discernible differences relative to existing approaches.

**Bio: **
Georgios B. Giannakis (Fellow’97) received his Diploma in Electrical Engr. from the Ntl. Tech. Univ. of Athens, Greece, 1981. From 1982 to 1986 he was with the Univ. of Southern California (USC), where he received his MSc. in Electrical Engineering, 1983, MSc. in Mathematics, 1986, and Ph.D. in Electrical Engr., 1986. He was with the University of Virginia from 1987 to 1998, and since 1999 he has been a professor with the Univ. of Minnesota, where he holds a Chair in Wireless Telecommunications, a University of Minnesota McKnight Presidential Chair in ECE, and serves as director of the Digital Technology Center. His general interests span the areas of communications, networking and statistical signal processing – subjects on which he has published more than 400 journal papers, 700 conference papers, 25 book chapters, two edited books and two research monographs (h-index 130). Current research focuses on big data analytics, wireless cognitive radios, network science with applications to social, brain, and power networks with renewables. He is the (co-) inventor of 32 patents issued, and the (co-) recipient of 9 best paper awards from the IEEE Signal Processing (SP) and Communications Societies, including the G. Marconi Prize Paper Award in Wireless Communications. He also received Technical Achievement Awards from the SP Society (2000), from EURASIP (2005), a Young Faculty Teaching Award, the G. W. Taylor Award for Distinguished Research from the University of Minnesota, and the inaugural IEEE Fourier Technical Field Award (2015). He is a Fellow of EURASIP, and has served the IEEE in a number of posts including that of a Distinguished Lecturer for the IEEE-SP Society.

Tsinghua University

**Title: **Timely-Throughput Optimal Scheduling with Prediction

**Abstract: **
Motivated by the increasing importance of providing delay-guaranteed services in general computing and communication systems, and the recent wide adoption of learning and prediction in network control, in this work, we consider a general stochastic single-server multi-user system and investigate the fundamental benefit of predictive scheduling in improving timely-throughput, being the rate of packets that are delivered to destinations before their deadlines. By adopting an error rate-based prediction model, we first derive a Markov decision process (MDP) solution to optimize the timely-throughput objective subject to an average resource consumption constraint. Based on a packet-level decomposition of the MDP, we explicitly characterize the optimal scheduling policy and rigorously quantify the timely-throughput improvement due to predictive-service, which scales as $\Theta(p\left[C_{1}\frac{(a-a_{\max}q)}{p-q}\rho^{\tau}+C_{2}(1-\frac{1}{p})\right](1-\rho^{D}))$, where $a, a_{\max}, \rho\in(0, 1), C_1>0, C_2\ge0$ are constants, $p$ is the true-positive rate in prediction, $q$ is the false-negative rate, $\tau$ is the packet deadline and $D$ is the prediction window size. We also conduct extensive simulations to validate our theoretical findings. Our results provide novel insights into how prediction and system parameters impact performance and provide useful guidelines for designing predictive low-latency control algorithms.

**Bio: **
Dr. Longbo Huang is an assistant professor at the Institute for Interdisciplinary Information Sciences (IIIS) at Tsinghua University, Beijing, China. He received his Ph.D. in EE from the University of Southern California, and then worked as a postdoctoral researcher in the EECS dept. at University of California at Berkeley before joining IIIS. Dr. Huang was a visiting scientist at the Simons Institute for the Theory of Computing at UC Berkeley in Fall 2016.
Dr. Huang has held various visiting positions at the LIDS lab at MIT, the Chinese University of Hong Kong, Bell-labs France, and Microsoft Research Asia (MSRA). He was selected into China’s Youth 1000-talent program in 2013, and received the outstanding teaching award from Tsinghua university in 2014. Dr. Huang received the Google Research Award and the Microsoft Research Asia Collaborative Research Award in 2014, and was selected into the MSRA StarTrack Program in 2015. Dr. Huang has served as TPC members for top-tier IEEE and ACM conferences, and as the TPC vice-chair for submissions for IEEE WiOpt 2016, the Publicity Chair for ACM e-Energy 2017-2018, and the Workshop Chair for Sigmetrics 2018. Dr. Huang was the lead guest editor for the JSAC special issue on “Human-In-The-Loop Mobile Networks” in 2016. He currently serves as an editor for IEEE Transactions on Communications (TCOM), and an associate editor for ACM Transactions on Modeling and Performance Evaluation of Computing Systems (ToMPECS). Dr. Huang’s current research interests are in the areas of stochastic modeling and analysis, optimization and control, machine learning, and sharing economy.

Tsinghua-Berkeley Shenzhen Institute

**Title: **Feature Projection : An Information Theoretic Interpretation of Deep Neural Networks

**Abstract: **
It is commonly believed that the hidden layers of deep neural networks (DNN) attempt to extract informative features for learning tasks. In this talk, we make this intuition precise by showing that the features extracted by DNN coincide with the result of an optimization problem, which we call the "universal feature selection" problem. We interpret the learning in DNN as projection of feature functions onto functional subspaces, specified by the network structure. Our formulation has several advantages: 1) it has direct operational meaning in terms of the performance for inference tasks; 2) it can be solved by alternative numerical procedures and sometimes even analytically, providing rich alternatives to DNN implementations; 3) it gives interpretations to the internal computation results of DNN; and 4) it naturally leads to a performance metric that can be applied to general DNN as well as other feature selection algorithms, which can be used to evaluate the effectiveness of the DNN structures and improve designs. We present the results of some numerical experiments to support our approach.

**Bio: **
Shao-Lun Huang received the B.S. degree with honor in 2008 from the Department of Electronic Engineering, National Taiwan University, Taiwan, and the M.S. and Ph.D. degree in 2010 and 2013 from the Department of Electronic Engineering and Computer Sciences, Massachusetts Institute of Technology. From 2013 to 2016, he was working as a postdoctoral researcher jointly in the Department of Electrical Engineering at the National Taiwan University and the Department of Electrical Engineering and Computer Science at the Massachusetts Institute of Technology. Since 2016, he has joined Tsinghua-Berkeley Shenzhen Institute, where he is currently an assistant professor. His research interests include information theory, communication theory, machine learning, and social networks.

University of California, Irvine

**Title: **Spatial Deployment of Nodes in Heterogeneous Sensor Networks: A Source Coding Perspective

**Abstract: **
Wireless networks of the future, such as the Internet of Things, are envisioned to be highly heterogeneous.
In many applications, one is interested in optimally deploying a network of nonidentical nodes to a certain area of interest. These networks may include a multitude of connected autonomous nodes in one or more tiers. We formulate these deployment problems as quantizer design problems where different distortion measures should be associated with different quantization indices.
We discuss fundamental design challenges like the best spatial deployment of nodes to minimize the energy consumption or maximize the sensing accuracy while guaranteeing network connectivity. This is done by developing a quantization theory of heterogeneous reproduction points.
We will discuss the characteristics of such a heterogeneous quantization theory and show that some of the standard results of the traditional quantization theory, including Gersho's conjecture, do not hold.

**Bio: **
Hamid Jafarkhani is a Chancellor's Professor at the Department of Electrical Engineering and Computer Science, University of California, Irvine, where he is also the Director of Center for Pervasive Communications and Computing and the Conexant-Broadcom Endowed Chair. Among his awards are the IEEE Marconi Prize Paper Award in Wireless Communications, the IEEE Communications Society Award for Advances in Communication, and the IEEE Eric E. Sumner Award.
Dr. Jafarkhani is listed as a highly cited researcher in http://www.isihighlycited.com.
According to the Thomson Scientific, he is one of the top 10 most-cited researchers in the field of "computer science" during 1997-2007.
He is the 2017 Innovation Hall of Fame Inductee at the University of Maryland's School of Engineering, a Fellow of AAAS, an IEEE Fellow, and the author of the book "Space-Time
Coding: Theory and Practice."

University of California, San Diego

**Title: **Universal Probability and Context-Based Methods

**Abstract: **
One of the simplest, yet most powerful approaches in data processing (such as compression, prediction, filtering, and estimation) of sequential data with spatiotemporal memory (text, image, biological sequences, and time series) is to first parse a given sequence according to a context model and then apply symbol-by-symbol solutions for each context independently. This approach is attractive for two reasons. On the practical side, context-based parsing is simple and straightforward, yet it can extend most existing symbol-by-symbol algorithms, which are designed based on the assumption that the data is i.i.d., to the data with memory with minimal increase in complexity. On the theoretical side, the performance guarantee of any symbol-by-symbol algorithm for memoryless data can be easily translated into a performance guarantee for data with memory, establishing ``universality'' with respect to a broad class of unknown data distributions.
Most context-based algorithms, however, suffer the ``sparse context'' problem. As we increase the context size, which is necessary to capture more spatiotemporal dependence in given data, the number of contexts increases exponentially and thus each context has too few samples to learn the structure of the data reliably. As this problem becomes more severe when the alphabet size is large.
The talk discusses how the sparse-context problem can be solved by efficient aggregation of contexts. A few concrete examples, such as real applications of grayscale image denoising and biological sequence classification, will be presented to showcase the proposed approach.

**Bio: **
Young-Han Kim received his B.S. degree in Electrical Engineering from Seoul National University in 1996 and his Ph.D. degree in Electrical Engineering (M.S. degrees in Statistics and in Electrical Engineering) from Stanford University in 2006. Since then he has been a faculty member in the Department of Electrical and Computer Engineering at the University of California, San Diego, where he is currently a Professor.
Professor Kim is a recipient of the NSF CAREER Award (2008), the US-Israel BSF Bergmann Memorial Award (2009), the IEEE Information Theory Paper Award (2012), and the first IEEE James L. Massey Research and Teaching Award (2015). He is an IEEE Fellow. His research interests include information theory, communication engineering, and data science. He has coauthored the book Network Information Theory (Cambridge University Press, 2011), which has been used widely as a textbook on the topic.

Technical University of Munich

**Title: **Information Theory for Selected Optical Fiber Models

**Abstract: **
Fiber optic channels are a complex medium with frequency dependent nonlinearity, dispersion, loss, amplification, and noise. We develop information theory for three simplified models that we hope provides useful insight on realistic models. The first model is used to compute information rates for optically-routed networks by simulation. Although the results are old by now (developed in 2006), they provide essentially the best rates to date. The second model assumes bandlimited propagation, and it lets us develop a spectral efficiency upper bound that is tight at power levels that are usually present in today’s systems. The third model is dispersion-free, and we use it to motivate studying the standard filter-and-sample model of communication theory.

**Bio: **
Gerhard Kramer is Alexander von Humboldt Professor and Chair of Communications Engineering at the Technical University of Munich (TUM). He received the B.Sc. and M.Sc. degrees in electrical engineering from the University of Manitoba, Canada, in 1991 and 1992, respectively, and the Dr. sc. techn. degree from the ETH Zurich, Switzerland, in 1998. From 1998 to 2000, he was with Endora Tech AG in Basel, Switzerland, and from 2000 to 2008 he was with the Math Center at Bell Labs in Murray Hill, NJ. He joined the University of Southern California (USC), Los Angeles, CA, as a Professor of Electrical Engineering in 2009. He joined TUM in 2010. Gerhard Kramer’s research interests are primarily in information theory and communications theory, with applications to wireless, copper, and optical fiber networks.

Delft University of Technology

**Title: **Statistical Inference through Sparse Sensing

**Abstract: **
Ubiquitous sensors generate prohibitively large data sets. Large volumes of such data are nowadays generated by a variety of applications such as imaging platforms and mobile devices, surveillance cameras, social networks, power networks, to list a few. In this era of data deluge, it is of paramount importance to gather only the data that is informative for a specific task in order to limit the required sensing cost, as well as the related costs of storing, processing, or communicating the data. The main goal of this talk is therefore to present topics that transform classical sensing methods, often based on Nyquist-rate sampling, to more structured low-cost sparse sensing mechanisms designed for specific inference tasks, such as estimation, filtering, and detection. More specifically, we present fundamental tools to achieve the lowest sensing cost with a guaranteed performance for the task at hand. Applications can be found in the areas of radar, multi-antenna communications, remote sensing, and medical imaging.

**Bio: **
Geert Leus received the M.Sc. and Ph.D. degree in Electrical Engineering from the KU Leuven, Belgium, in June 1996 and May 2000, respectively. Geert Leus is now an "Antoni van Leeuwenhoek" Full Professor at the Faculty of Electrical Engineering, Mathematics and Computer Science of the Delft University of Technology, The Netherlands. His research interests are in the broad area of signal processing, with a specific focus on wireless communications, array processing, sensor networks, and graph signal processing. Geert Leus received a 2002 IEEE Signal Processing Society Young Author Best Paper Award and a 2005 IEEE Signal Processing Society Best Paper Award. He is a Fellow of the IEEE and a Fellow of EURASIP. Geert Leus was a Member-at-Large of the Board of Governors of the IEEE Signal Processing Society, the Chair of the IEEE Signal Processing for Communications and Networking Technical Committee, and the Editor in Chief of the EURASIP Journal on Advances in Signal Processing. He was also on the Editorial Boards of the IEEE Transactions on Signal Processing, the IEEE Transactions on Wireless Communications, the IEEE Signal Processing Letters, and the EURASIP Journal on Advances in Signal Processing. Currently, he is a Member of the IEEE Sensor Array and Multichannel Technical Committee, an Associate Editor of Foundations and Trends in Signal Processing, and the Editor in Chief of EURASIP Signal Processing.

Texas A&M University

**Title: **Info-Clustering: An Information-Theoretic Paradigm for Data Clustering

**Abstract: **
The problem of clustering random variables is considered from an information-theoretic perspective. A new paradigm that utilizes the notion of multivariate mutual information is proposed. We show that the choice of a multivariate mutual information measure inspired by the classical problem of multi-terminal secret key agreement leads to clustering solutions that are guaranteed to be hierarchical. Furthermore, we show that the clustering solutions can be computed in strongly polynomial time (under the entropy oracle) via submodular function optimizations. Finally, we demonstrate that under the well-known Chow-Liu tree approximation on the joint distribution of the random variables, the proposed clustering solutions reduce to those of clustering by mutual information relevance networks, a well-known algorithm for discovering functional genomic clusters in computational biology.

**Bio: **
Tie Liu received his B.S. (1998) and M.S. (2000) degrees, both in Electrical Engineering, from Tsinghua University, Beijing, China and a second M.S. degree in Mathematics (2004) and a Ph.D. degree in Electrical and Computer Engineering (2006) from the University of Illinois at Urbana-Champaign. Since August 2006 he has been with Texas A&M University, where he is currently a Professor in the Department of Electrical and Computer Engineering. His primary research interest is in the area of information theory and statistical information processing. Dr. Liu received an M. E. Van Valkenburg Graduate Research Award (2006) from the University of Illinois at Urbana-Champaign and a CAREER Award (2009) from the National Science Foundation. He was a Technical Program Committee Co-Chair for the 2008 IEEE GLOBECOM, a General Co-Chair for the 2011 IEEE North American School of Information Theory, and an Associate Editor for Shannon Theory for the IEEE Transactions on Information Theory during 2014-2016.

The Chinese University of Hong Kong

**Title: **The AND-OR interference channel or the sand-glass conjecture

**Abstract: **
This talk concerns the achievable rates for a deterministic interference channel
where the inputs are binary and one receiver receives the (logical) AND of the two inputs while the other receiver receives the (logical) OR of the two inputs. This is the only deterministic interference channel setting (up to isomorphism) with binary inputs and outputs whose capacity region has not been established. For the zero-probability-of-error criterion, the sand-glass conjecture by Ahlswede and Simonyi postulates that the capacity region for this setting is given by the time-division line. (Note: In the remainder of the abstract, as is the norm in information theory, capacity region refers to the set of achievable rates where the probability of error goes to zero asymptotically in the block length.)
The first observation in the talk is that a careful evaluation of an outer bound for the interference channel by Etkin and Ordentlich (2009) shows that the sum-rate of the capacity region is upper bounded by 1.1805 (the authors had done a sub-optimal computation and got a larger number). Surprisingly, this is an improvement on the best upper bound for the zero-error version of this problem of 1.1815 obtained in 2017 as these two versions were studied by different communities. The second observation is that a judicious choice of the auxiliaries in the Han--Kobayashi achievable region for the capacity region for this channel shows that a sum-rate of 1.015 is achievable.
In this talk, the main result is an improvement of the outer bound of the sum-rate for
the capacity region to 1.146 using a non-standard and tailor-made analysis using multi-letter expressions.
Acknowledgements: This work was motivated by conversations with Janos Korner and Abbas El Gamal. Mehdi Yazdanpanah, a Ph.D. candidate, helped carry out the numerical simulations that enabled one to make the first two observations.

**Bio: **
Chandra Nair's research interests and contributions have been in developing ideas, tools, and techniques to tackle families of combinatorial and non-convex optimization problems arising primarily in the information sciences. His recent research focus has been on studying the optimality of certain inner and outer bounds to capacity regions for fundamental problems in multiuser information theory.
Chandra Nair got his Bachelor's degree from IIT Madras (India) where he was the Philips (India) and Siemens (India) award winner for the best academic performance; and his master's and Ph.D. from Stanford, where he was a Stanford graduate fellow (00-04) and a Microsoft graduate fellow (04-05). He is a fellow of IEEE and is currently a distinguished lecturer of the IEEE Information theory society. He received the 2016 Information Theory Society paper award for developing a novel way to establish optimality of Gaussian distributions.
Chandra Nair is an Associate Professor with the Information Engineering department at The Chinese University of Hong Kong, where he also serves as the Programme Director of the undergraduate program on Mathematics and Information Engineering and as the Director of the Institute of Theoretical Computer Science and Communications.

George Mason University

**Title: **Sparse Signal Processing for Wireless Communications

**Abstract: **
Sparse signal processing has demonstrated its usefulness in wireless communications over recent years. In the emerging era of data deluge, wireless systems such as 5G and Internet of Things (IoT) have to be able to sense and process an unprecedentedly large amount of data in real time, which render traditional communication and signal processing techniques inefficient or inapplicable. Meanwhile, there are exciting new developments on the theory and algorithms of sparse signal processing and compressive sensing, which offer powerful tools to effectively deal with high-dimensional signals, large-size problems, and big-volume data. This talk focuses on recent new results on sparse signal processing, including structure-based compressive sensing beyond sparsity, compressive covariance sensing and super-resolution gridless compressive sensing. Sparse signal processing principles and techniques are applied to various wireless applications where signal and information acquisition costs are high, such as wideband spectrum sensing in cognitive radios, high-dimensional direction-of-arrival estimation, and sparse channel estimation using large-antenna arrays in both millimeter-wave communication systems and IoT applications.

**Bio: **
Dr. Zhi Tian has been a Professor in the Electrical and Computer Engineering Department of George Mason University since 2015. Previously she was on the faculty of Michigan Technological University, and served a three-year term as a Program Director at the US National Science Foundation. Her research interests lie in statistical signal processing, wireless communications, and decentralized network optimization. She is an IEEE Fellow. She is Chair of the IEEE Signal Processing Society Big Data Special Interest Group. She was General Co-Chair of the IEEE GlobalSIP Conference in 2016. She served as an IEEE Distinguished Lecturer, and Associate Editor for the IEEE Transactions on Wireless Communications and IEEE Transactions on Signal Processing.

The Chinese University of Hong Kong

**Title: **The Explicit Coding Rate Region of Symmetric Multilevel Diversity Coding

**Abstract: **
It is well known that superposition coding, namely separately encoding the independent sources, is optimal for symmetric multilevel diversity coding (SMDC) (Yeung-Zhang 1999) for any L ≥ 2, where L is the number of levels of the coding system. However, the characterization of the coding rate region therein involves uncountably many linear inequalities and the constant term (i.e., the lower bound) in each inequality is given in terms of the solution of a linear optimization problem. Thus this implicit characterization of the coding rate region does not enable the determination of the achievability of a given rate tuple. In principle, the achievability of a given rate tuple can be determined by direct computation, but the complexity is prohibitive even for L = 5. In a very recent work (Guo-Yeung 2018), for any fixed L, we obtain in closed form a finite set of linear inequalities for characterizing the coding rate region. We further show by the symmetry of the problem that only a much smaller subset of this finite set of inequalities needs to be verified in determining the achievability of a given rate tuple. Yet, the cardinality of this smaller set grows at least exponentially fast with L.

**Bio: **
Raymond W. Yeung is a Choh-Ming Li Professor of Information Engineering at The Chinese University of Hong Kong (CUHK). He received his B.S., M.Eng., and PhD degrees from Cornell University in electrical engineering in 1984, 1985, and 1988, respectively. Before joining CUHK in 1991, he was a Member of Technical Staff at AT&T Bell Laboratories. A co-founder of the field of network coding, he has been serving as Co-Director of the Institute of Network Coding at CUHK since 2010. He is the author of the books A First Course in Information Theory (Kluwer Academic/Plenum Publishers, 2002) and Information Theory and Network Coding (Springer 2008), which have been adopted by over 100 institutions around the world. In spring 2014, he gave the first MOOC in the world on information theory that reached over 25,000 students.
He was a recipient of the 2005 IEEE Information Theory Society Paper Award, the Friedrich Wilhelm Bessel Research Award from the Alexander von Humboldt Foundation in 2007, the 2016 IEEE Eric E. Sumner Award (Citation: “For pioneering contributions to the field of network coding”), and the 2018 ACM SIGMOBILE Test-of-Time Paper Award. In 2015, he was named an Outstanding Overseas Chinese Information Theorist by the China Information Theory Society. He is a Fellow of the IEEE, Hong Kong Academy of Engineering Sciences, and Hong Kong Institution of Engineers.

University of Florida

**Title: **Knowledge Centric Networking: From Internet+ to AI+

**Abstract: **
In the creation of a smart future information society, Internet of Things (IoT) and Content Centric Networking (CCN) break two key barriers for both the front-end sensing and back-end networking. However, we still observe the missing piece of the research that dominates the current design, i.e., lacking of the knowledge penetrated into both sensing and networking to glue them holistically. In this talk, I will introduce and discuss a new networking paradigm, called Knowledge Centric Networking (KCN), as a promising solution. The key insight of KCN is to leverage emerging machine learning or artificial intelligence (AI) techniques to create knowledge for networking system designs, and extract knowledge from collected data to facilitate enhanced system intelligence and interactivity, improved quality of service, communication with better controllability, and lower cost. This talk presents the KCN design rationale, the KCN benefits and also the potential research opportunities.

**Bio: **
Dapeng Oliver Wu received Ph.D. in Electrical and Computer Engineering from Carnegie Mellon University, Pittsburgh, PA, in 2003. Since 2003, he has been on the faculty of Electrical and Computer Engineering Department at University of Florida, Gainesville, FL, where he is currently Professor. His research interests are in the areas of networking, communications, video coding, image processing, computer vision, signal processing, and machine learning.
He received University of Florida Term Professorship Award in 2017, University of Florida Research Foundation Professorship Award in 2009, AFOSR Young Investigator Program (YIP) Award in 2009, ONR Young Investigator Program (YIP) Award in 2008, NSF CAREER award in 2007, the IEEE Circuits and Systems for Video Technology (CSVT) Transactions Best Paper Award for Year 2001, the Best Paper Award in GLOBECOM 2011, and the Best Paper Award in QShine 2006. Currently, he serves as Editor-in-Chief of IEEE Transactions on Network Science and Engineering, and Associate Editor of IEEE Transactions on Communications, IEEE Transactions on Signal and Information Processing over Networks, and IEEE Signal Processing Magazine. He was the founding Editor-in-Chief of Journal of Advances in Multimedia between 2006 and 2008, and an Associate Editor for IEEE Transactions on Circuits and Systems for Video Technology, IEEE Transactions on Wireless Communications and IEEE Transactions on Vehicular Technology. He has served as Technical Program Committee (TPC) Chair for IEEE INFOCOM 2012. He was elected as a Distinguished Lecturer by IEEE Vehicular Technology Society in 2016. He is an IEEE Fellow.

Purdue University

**Title: **Securing Distributed Machine Learning in High Dimensions

**Abstract: **
We consider security issues in a distributed machine learning system wherein the data is kept confidential by its providers who are recruited as workers to help the learner to train a $d$--dimensional model. In each communication
round, up to $q$ out of the $m$ workers suffer Byzantine faults; faulty workers are assumed to have complete knowledge of the system and can collude to behave arbitrarily adversarially against the learner. We assume that each worker keeps a local sample of size $n$. (Thus, the total number of data points is $N=nm$.) Of particular interest is the high-dimensional regime $d \gg n$.
We propose a secured variant of the classical gradient descent method which can tolerate up to a constant fraction of Byzantine workers. We show that the estimation error of the iterates converges to an estimation error $O(\sqrt{q/N} + \sqrt{d/N})$ in $O(\log N)$ rounds. The core of our method is a robust gradient aggregator based on the iterative filtering algorithm proposed by Steinhardt et al. \cite{Steinhardt18} for robust mean estimation. We establish a uniform concentration of the sample covariance matrix of gradients, and show that the aggregated gradient, as a function of model parameter, converges uniformly to the true gradient function. As a by-product, we develop a new concentration inequality for sample covariance matrices of sub-exponential distributions, which might be of independent interest.

**Bio: **
Jiaming Xu is an assistant professor in the Krannert School of Management at Purdue University which he joined in Fall 2016. Before that, he was a research fellow with the Simons Institute for the Theory of Computing, UC Berkeley, and a postdoctoral fellow with the Statistics Department, The Wharton School, University of Pennsylvania. He received the Ph.D. degree from UIUC in 2014 under the supervision of Prof. Bruce Hajek, the M.S. degree from UT-Austin in 2011, and the B.E. degree from Tsinghua University, all in ECE. His research interests include high-dimensional statistical inference, information theory, convex and non-convex optimization, queueing theory, and game theory.

University of Toronto

**Title: **Sporadic Device Activity Detection for Massive Connectivity

**Abstract: **
In this talk, we examine the role of non-orthogonal multiple-access for a massive device communication scenario in which a large number of devices need to connect to an access-point, but user traffic is sporadic so that at any given time only a subset of users are active. For such a system, user activity detection and channel estimation are key issues. This talk discusses the role of compressed sensing techniques for joint device activity detection and channel estimation, and characterizes the detection and data transmission performance of an approximate message passing (AMP) based device identification method. This talk further considers the massive connectivity problem in the massive MIMO regime. We analytically show that massive MIMO can significantly enhance user activity detection, but the non-orthogonality of pilot sequences can nevertheless introduce significant channel estimation error, hence limiting the overall transmission rate. We quantify this effect and characterize the optimal pilot length for massive uncoordinated device access.

**Bio: **
Wei Yu received the B.A.Sc. degree in Computer Engineering and Mathematics from the University of Waterloo, Waterloo, Ontario, Canada in 1997 and M.S. and Ph.D. degrees in Electrical Engineering from Stanford University, Stanford, CA, in 1998 and 2002, respectively. Since 2002, he has been with the Electrical and Computer Engineering Department at the University of Toronto, Canada, where he is now Professor and holds a Canada Research Chair (Tier 1) in Information Theory and Wireless Communications. Prof. Wei Yu currently serves on the IEEE Information Theory Society Board of Governors. He is currently the Chair of the Signal Processing for Communications and Networking Technical Committee of the IEEE Signal Processing Society. He is an Area Editor for the IEEE Transactions on Wireless Communications (2017-20). Prof. Wei Yu was an IEEE Communications Society Distinguished Lecturer (2015-16). He received the IEEE Communications Society Best Tutorial Paper Award in 2015, the IEEE Signal Processing Society Best Paper Award in 2008 and 2017, and the Journal of Communications and Networks Best Paper Award in 2017. Prof. Wei Yu is a Fellow of IEEE, a member of the College of New Scholars, Artists and Scientists of the Royal Society of Canada, and a Fellow of Canadian Academy of Engineering.

Georgia Institute of Technology

**Title: **Towards Understanding Overparameterized Nonconvex Optimization in Machine Learning

**Abstract: **
Training deep neural networks requires solving highly complicated nonconvex optimization problems, which were used to be considered as missions impossible. However, numerous evidences have shown that simple stochastic gradient descent algorithms (in conjunction with several important computational heuristics such as step size annealing and batch/weight normalization) are quite affordable to obtain good solutions, which generalize well without suffering from overfitting. Due to the limits of human knowledge, however, these complicated neural network models are still beyond our understanding at this moment.
This talk approaches the overparameterization challenge from above and below: First, we investigate generalization bounds of popular deep neural networks, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and generative adversarial networks (GANs), and discuss their implications and insufficiencies; Second, we study simpler but nontrivial overparametrized models in machine learning, including matrix factorization, principal component analysis, deep linear networks, which allows us to make progress toward understanding and gaining more insights of overparameterization in deep neural networks.

**Bio: **
This talk approaches the overparameterization challenge from above and below: First, we investigate generalization bounds of popular deep neural networks, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and generative adversarial networks (GANs), and discuss their implications and insufficiencies; Second, we study simpler but nontrivial overparametrized models in machine learning, including matrix factorization, principal component analysis, deep linear networks, which allows us to make progress toward understanding and gaining more insights of overparameterization in deep neural networks.