- Úvod
- Ústav
- Lidé
- Výzkum
- Aplikace
- Semináře a akce
- Knihovna
- Doktorské studium
- Kariéra

Ústav informatiky

Akademie věd České republiky

Akademie věd České republiky

"Přemýšlet je ta nejtěžší práce, jaká existuje. A to je pravděpodobně důvod, proč to tak málo lidí dělá." - Henry Ford

HORA INFORMATICAE (rozuměj: ČAS PRO INFORMATIKU) je širokospektrální vědecký seminář věnovaný všem základním oblastem informatiky a jejím interdisciplinárním rozhraním s jinými vědami a aplikovanými oblastmi. Originální příspěvky zabývající se klasickými i nově vznikajícími tématy jsou vítány. Seminář, který založil Jiří Wiedermann, probíhá od roku 1994 v Ústavu informatiky AV ČR v Praze.

ORGANIZERS: F. Hakl, V. Kůrková, J. Wiedermann | horainf@cs.cas.cz | https://www.cs.cas.cz/horainf |

#### Learning from few examples - nonlinearity and dimensionality.

12.03.2024 14:00 CETVelká zasedačka (č. 318, 2. poschodí) Ústavu informatiky AV ČR, v.v.i.ZOOM 954 7823 4977 (Heslo: 712564)A lot of attention has been given to the curses and blessings of learning with high dimensional data, and in this talk we will examine how these affect a system's ability to learn from few examples. For example, proving tight bounds on the generalisation performance of even simple classifiers in a conventional distribution-agnostic way typically requires that the quantity of training data grows extremely quickly with the data dimension. The 'curse of dimensionality' embodied by this conventional wisdom would seem to suggest that it is not possible to learn such data from just a few examples - yet we frequently observe this happening in practice! Moreover, practical experience shows that using nonlinear feature maps which artificially inflate data up into higher dimensions often make the learning problem an easier one, something which seems to contradict the expected 'curses of dimensionality'. We reveal that there are in fact large families of data distributions with certain geometric properties where high dimensional concentration phenomena actually make learning and generalising from few examples easier in higher dimensions. We also show that it is possible for carefully designed feature maps to embed data into higher dimensions in a way which meaningfully accelerates these 'blessings of dimensionality'.

[LINK]

:

#### Dispersion of a point set - lower and upper bounds.

20.02.2024Given a point cloud in the unit cube of a d-dimensional Euclidean space, its dispersion is defined as the volume of the largest axis-parallel box, which does not intersect this cloud. For a fixed integer n, we try to minimize the dispersion among the point sets of cardinality n. We review recent upper and lower bounds of this quantity, which turned to be connected to discrete geometry, probability, combinatorics, and analysis. Especially, we discuss a new lower bound obtained by a reduction to a certain combinatorial problem solved previously by Ruszinkó, Alon, and Asodi.

[LINK]

: #### GPU-acceleration of program synthesis.

30.01.2024GPUs are the work-horses of high-performance computing. The acceleration they provide to applications compatible with their programming paradigm can surpass CPU performance by several orders of magnitude, as notably evidenced by the advancements in deep learning. A significant spectrum of applications, especially within automated reasoning has yet to reap the benefits of GPU acceleration. In order for an application to be “GPU-friendly”, it needs to have high parallelism, minimal data-dependent branching, and predictable data movement with substantial data locality. Current automated reasoning algorithms are predominantly branching-intensive and appear sequential in nature, but it is unclear whether they are inherently sequential, or can be adapted to GPUs. We identify program synthesis as a candidate for GPU-acceleration. Program synthesis is an umbrella term for the algorithmic generation of programs (and similar formal objects, like logical formulae) from specifications. We conjecture that program synthesis is especially suitable for GPU- acceleration because it is embarrassingly parallel: most program synthesis involves “generate-and- check” phases where vast numbers of candidate solutions are first generated, and then checked if they meet the target specification. Synthesis stops as soon as a suitable candidate is found. This can be done in parallel with only moderate amounts of synchronisation, making program synthesis embarrassingly parallel and potentially GPU friendly. The (typically) exponential growth in potential synthesis candidates will effortlessly saturate the compute cores of any future processor. In order to test our hypothesis, we implement two classic machine learning problems using program synthesis on a GPU (regular expression inference from examples and linear-temporal logic learning, also from examples). We observe several orders of magnitude speedup, and an ability to handle much larger instances.

[LINK]

: #### Getting Structure in Dialogue with Large Language Models.

23.01.2024The current state of the art in text generation are large language models (LLMs) pretrained on vast amounts of text and finetuned to produce solutions given instructions. LLMs represent significant progress, allowing the user to request outputs for various tasks by stating a query in natural language and being able to follow examples provided by the user (in-context learning/prompting), without the need for further training (finetuning) on task-specific data. However, they retain some of the problems of the previous generation of language models, in particular their opacity and lack of controllability. This talk will show experiments on using LLMs with prompting only for multiple tasks: data-to-text generation, task-oriented dialogues, and dialogue evaluation. All these tasks operate with structure (structured data input, structured outputs, structured dialogue), which is not what these LLMs were specifically pretrained for. I show that LLMs are usable for these tasks, but also point out their limitations and potential areas of improvement.

[LINK]

: #### Principlism as the ethics of artificial intelligence.

09.01.2024There are several ethical theories, but when it comes to practical efforts to regulate AI, only one ethical system is advocated, known as principlism. It has gradually been incorporated into both supranational and national documents on AI ethics. In my talk, I will introduce principlism and explore its practical applications. In particular, I will address sensitive areas of AI ethics, including privacy, transparency, and algorithmic bias. I aim to show that while the tendency to adopt a single ethical system for AI regulation is understandable, this approach may not be entirely sustainable in both theory and practice.

[LINK]

: #### The Voynich Manuscript: the secret book of an unknown language.

05.12.2023The Voynich Manuscript, a mysterious document of unknown origin, remains one of the greatest unsolved mysteries in the history of linguistics and cryptography. This presentation takes a multidisciplinary look at the manuscript, examining its physical characteristics, content, and illustrations, and presents recent attempts to decipher its unknown language. Particular attention is paid to modern technologies such as artificial intelligence and spectroscopy that may provide new insights. This presentation also presents in-house research that contributes to further understanding of this enigmatic document, combining linguistic analysis and advanced data processing algorithms. The aim is to provide a comprehensive overview of the current state of knowledge and to open a discussion on possible future research directions.

[LINK]

: #### Chomsky-Like Neural Network Hierarchy.

07.11.2023The computational power of neural networks (NNs) depends on the Kolmogorov complexity of their weight parameters. We will present a Chomsky-like hierarchy for NNs with α analog neurons (αANN), which complements the analysis for realistic models between integer (finite automata) and rational weights (Turing machines). We achieved the separation of the first two levels 1ANN, 2ANN of this hierarchy and proved its collapse to 3ANN. This research also motivated the definitions of two new interesting concepts that will be introduced in the lecture. Namely, we will define a periodic number in positional system with non-integer radix and digits. We will present a new notion of a C-simple problem which is a conceptual counterpart to the standard complexity-theoretic definition of C-hard problems, e.g. NP-hard problems. (This is a joint work with Petr Savický, Petr Jančar, and Martin Plátek).

[LINK]

: #### Approximation of classifiers of large data sets by deep ReLU networks.

24.10.2023Rapid development of experimental research and successful practical applications of deep networks inspires many theoretical questions. In this talk, we will focus on approximation capabilities of deep ReLU networks, which are one of the most popular network architectures. We will explore the effect of network depths and numbers of their parameters on behavior of approximation errors. To obtain probabilistic bounds on approximation errors, we will employ concepts from statistical learning theory (growth function, VC dimension) and high-dimensional geometry (concentration of measure). We will address the dilemma between approximation accuracy and consistency in learning from random samples of data. We will discuss limitations of approximation capabilities of networks of finite VC dimension in distribution agnostic settings.

[LINK]

: #### Artificial Wisdom: On the Power of Generative AI.

10.10.2023In this lecture, we explore the purpose and potential of artificial intelligence (AI) in light of the current approaches to generative AI. We argue that the respective models can be seen as tools for acquiring and generating artificial wisdom, enabling us to make wiser decisions and behave more intelligently. Unlike earlier AI approaches, which focused on generating knowledge from data, contemporary language models have the ability to extract meaning from syntactic patterns and relate this meaning to real-world descriptions. This capacity reflects a form of cognition known in biology as 4E cognition, which emphasizes the embodied, embedded, extended, and enacted nature of intelligent behavior. We argue that contemporary large language models possess a form of illusory intelligence and illusory wisdom that has not been described in the literature before. By recognizing the potential of generative AI to generate and use wisdom, we can move beyond knowledge-centric approaches to AI and develop more nuanced models of intelligent behavior.

[LINK]

: #### HW Accelerated AI Inference and Fast, Recursive QR System Identification.

13.06.2023We will explain design, implementation and performance of HW accelerated AI inference with 8- bit, fixed point, neural networks for object detection (resnet50) and face detection (dense- box_640_360) on System-on-Chip (SoC) AMD-Xilinx Zynq UltraScale+ device. It is 4-core ARM A53 64bit processor with on-chip programmable logic. We will also explain implementation and performance of HW accelerated, fast, recursive, adaptive system identification QR Lattice algorithms. Algorithms are implemented as pipelined systolic arrays on our floating point FP32 SIMD data processing engines (DPUs) on Zynq UltraScale+ device.

[LINK]

: #### Causality and Machine Learning.

30.05.2023Despite the recent success and widespread applications of machine learning (ML) algorithms for classification and prediction in a variety of fields, they face difficulty in interpretability, trustworthiness and generalization. One of the main reasons for this is that these algorithms are building black-box models by learning statistical associations between the given 'input' and its 'output'. Decisions made solely based on 'associational learning' are insufficient to provide explanations and hence difficult to be employed in real world tasks requiring transparency and reliability. To overcome these limitations of ML algorithms, researchers are moving towards 'causal machine learning' by aiding ML decision-making based on causal reasoning and understanding. We will discuss 'the science of causality', its requirements in ML and possible means of integration with ML. We will also compare different ML algorithms based on their performance in learning temporal order/ structure in single time series as well as their ability to classify coupled pairs of time-series based on their cause-effect (or driver-driven) relationship.

[LINK]

: #### TNL: Numerical library for modern parallel architectures.

16.05.2023TNL is a collection of building blocks that facilitate the development of efficient numerical solvers and HPC algorithms. It is implemented in C++ using modern programming paradigms in order to provide a flexible and user-friendly interface similar to, for example, the C++ Standard Template Library. TNL provides native support for modern hardware architectures such as multicore CPUs, GPUs, and distributed systems, which can be managed via a unified interface. In our presentation, we will demonstrate the main features of the library together with efficiency of the implemented algorithms and data structures.

[LINK]

: #### Computability of Neural Network Problems.

02.05.2023Despite the overwhelming success of deep learning methods in practice, an increasing number of concerns is being raised concerning the reliability of neural networks and therefore the safety of their employment in various applications. These uncertainties, also reflected in the current EU AI Act proposal and the highly publicised open letter Pause Giant AI Experiments, result from a lack of theoretical understanding and performance guarantees. A fundamental example of this is a series of recent results implying that for many continuous problems, including practical applications such as MRI scanning and 3D image reconstruction, there exists no algorithm on digital hardware with unlimited resources that could guarantee correct learning of a solving neural network, even in cases where such a network exists. In this talk we discuss some of these issues, as well as attempts to resolve them and their relevance to practice.

[LINK]

: #### Zero-cost proxies for neural network performance estimation.

18.04.2023In neural architecture search (NAS), the task is to find the best-performing architecture for a given deep learning task. The bottleneck of the search is the time-consuming training and evaluation of networks. The costs can be reduced by using performace predictors, which are models that map an architecture to its trained performance. Recently, zero-cost proxies have been proposed as very fast predictors-network performance is estimated using a score that is computed by passing a single minibatch of data to an untrained network. In this presentation, I will present the zero-cost proxies and their current role in NAS, as well as my ongoing work on zero-cost proxy ensembling.

[LINK]

: #### GPT et al.: Generating Texts with Transformer-Based Large Language Models.

04.04.2023This presentation explores the use of large transformer-based language models for generating texts. We will discuss the history of language modeling and how the recent advances in recurrent neural networks (RNNs) and transformer networks have enabled the development of powerful language models. We will also examine the use of subwords and attention mechanisms in these models. Finally, we will present the THEaiTRE project, which uses transformer-based language models to generate theatre play scripts.

[LINK]

: #### From ridge functions to neural networks II.

21.03.2023The mathematics of the performance of neural networks in high-dimensional problems presents subtle challenges and many open problems. We discuss identification of a sum of the so-called ridge functions, which in turn corresponds to one layer of an artificial neural network. We show, that identification of such functions can be done by considering certain matrix decomposition as opposed to tensors of the third order used frequently in the literature. Finally, we present a construction of a multivariate Riesz basis of ReLU neural networks, which performs equally well independently on the dimension.

[LINK]

: #### Learning Logic Programs with negation, Predicate Invention, and Higher-Order Definitions Through the learning from Failures Paradigm.

07.03.2023Inductive Logic Programming (ILP) is a form of symbolic machine learning that learns clausal theories to explain sets of positive and negative evidence. Learning complex clausal theories that generalize well remains a formidable challenge. Extending the language over which the clausal theories are constructed results, not only in improved generalization, but also the ability to solve tasks that cannot be properly formulated in the simpler environment. Problematically, such language extensions lead to unsoundness of existing approaches. In this talk, we will introducing the learning from failures paradigm (developed by Andrew Cropper and Rolf Morel) and described extension of the approach that soundly handle the addition of negation, Predicate Invention, and Higher-Order Definitions. (Joint work with Stanisław J. Purgał, Cezary Kaliszyk, and Andrew Cropper).

[LINK]

: #### From discrete to continuous variational inequalities with convex and non-convex constraints.

21.02.2023Constrained optimization problems can often be stated in terms of variational inequalities. The talk will present a method based on the Kurzweil integral calculus which allows to solve discrete and continuous evolution variational inequalities under the same formalism in both the convex and the non-convex case. The results have been published recently in cooperation with G. A. Monteiro (IM CAS Prague) and V. Recupero (Politecnico Torino).

[LINK]

: #### From ridge functions to neural networks I.

24.01.2023The mathematics of the performance of neural networks in high-dimensional problems presents subtle challenges and many open problems. We start with a discussion of the so-called ridge functions, which are just a composition of a uni-variate function with an inner product. In some sense, they model one artificial neuron. We show that although this model looks rather simple, the identification of such function might suffer the curse of dimension. Next we discuss identification of a sum of ridge functions, which in turn corresponds to one layer of an artificial neural network. Finally, the last result, which we present in this talk, reveals a multivariate Riesz basis of ReLU neural networks, which performs equally well independently on the dimension.

[LINK]

: #### How radar interferometry could reconcile fuzzy sets with probability

10.01.2023Introduction to radar interferometry will be presented in the framework of two different competitive data processing approaches. Processing of interferometric radar data has induced some serious practical problems that can be solved with the aid of the fuzzy sets theory as well as by means of the probability theory. As a consequence of comparison of the both approaches, probabilistic interpretation of fuzzy sets will be exposed. Moreover, fuzzy approach can serve as an approximate tool for evaluation of complicated, intractable integrals that frequently occur in fully probabilistic solution. Some other, non-traditional tools for radar data processing will be mentioned, namely spatial statistics.

[LINK]

: #### Probabilistic Forecasting with International Large-Scale Assessments: Applications to the UN Sustainable Development Goals

20.12.2022In 2015, the Member States of the United Nations (UN) adopted the Sustainable Development Goals. With regard to education, the UN identified equitable, high-quality education, including the achievement of literacy and numeracy by all youth and a substantial proportion of adults, both men and women, as one of its global SDGs to be attained by 2030. To analyze education policies such as these, it is critically important to monitor trends in educational outcomes over time. Indeed, as educational systems around the world face new challenges due to the COVID-19 pandemic, monitoring trends in educational outcomes could help identify the long-run impact of this unprecedented health crisis on global education. To this end, international large-scale assessment programs such as PISA are uniquely situated to provide population-level trend data on literacy and numeracy outcomes. The purpose of this talk is to describe a new project in collaboration with the University of Heidelberg and funded by the US Institute of Education Sciences. This project proposes a methodology applicable to international large-scale assessments, and PISA in particular, to monitor and forecast changes in gender equity and to relate changes over time in gender equity to policy- relevant predictors and exogenous events such as the coronavirus pandemic. We utilize a Bayesian workflow to account for uncertainty in all steps in the modeling process, including uncertainty in the parameters of the model as well as model uncertainty in the choice of policy-relevant predictors. A proof-of-concept using data from the United States NAEP program provides a demonstration of the ideas.

[LINK]

: #### Model M - an agent-based epidemiological model

06.12.2022During the recent pandemic, the interest in epidemiological modeling rapidly increased. Epidemiological models improve our understanding of the dynamics of disease spread and help us during the design of various protective measures. The important family of models is formed by agent-based models. They provide simulation tools for detailed modeling of individual human behaviour. We will present our network agent model called model M. It works with a population of individuals (agents) and their contacts are modeled as a multi-graph social network according to real data based on a Czech county. Custom algorithmic procedures simulating testing, quarantine and partial closures of various contact types are implemented. The model can serve as a tool for relative comparison of the efficacy of various policies. It was also used for a study comparing various interventions in Czech primary and secondary schools, using a graph based on real data from a selected Czech school.

[LINK]

: #### Some implications of high-dimensional geometry for classification by neural networks

22.11.2022Computational difficulties of multidimensional tasks, called the ``curse of dimensionality’’, have long been known. On the other hand, almost deterministic behavior of some randomized models and algorithms depending on large numbers of variables can be attributed to the ``blessing of dimensionality’’. These phenomena can be explained by rather counter-intuitive properties of geometry of high-dimensional spaces. They imply concentration of values of sufficiently smooth functions of many variables around their mean values and possibilities of reduction of dimensionality of data by random projections. In the lecture, it will be shown how these properties of high-dimensional geometry can be employed to obtain some insights into suitability of various types of neural networks for classification of large data sets. Probabilistic bounds on network complexity will be derived using concentration properties of approximation errors based on Azuma and McDiarmid inequalities. Consequences for choice of network architectures will be analyzed in terms of growth functions and VC dimensions of sets of network input-output functions. General results will be illustrated by examples of deep perceptron networks with various piecewise polynomial activation functions (ReLU, RePU).

[LINK]

: #### Autonomous machines and machine minds

08.11.2022Autonomous machines, like robots and self-driving cars, are machines that perform behaviors or tasks with a high degree of autonomy (without, or with minimal external influence). Such machines operate and cooperate in the real world while coping with unpredictable external phenomena and error-prone technology. We model any autonomous machine as a cyber-physical system controlled by universal processors implementing so-called minimal machine consciousness, minimal machine experience, and minimal machine understanding. In the talk, we present in more detail the respective model, define and justify our machine versions of consciousness, experience, and understanding and indicate their computational realization. We argue that endowing the systems with all these cognitive feats leads to more trustworthy, safer, and more reliable systems, by increasing their behavioral flexibility and autonomy in varying environments and by strengthening their resilience to operation or cooperation failures of their components or as a whole. The talk is based on joint research with Jan van Leeuwen, Utrecht, NL.

[LINK]

: #### How to deal with data contamination in hypothesis testing

16.12.2019The theory of testing statistical hypotheses aims not only at proposing suitable tests, but also (and mainly) at deriving their optimal procedures under certain conditions. Any type of data contamination however requires non-standard approaches to testing. This talk, which is intended for non-statisticians, will very briefly overview principles of several recent results, on which I participated. To be specific, particular results include: Minimax tests under measurement errors, Locally optimal tests based on sequential ranks, Tests based on ranks of interpoint distances among high-dimensional observations, Symmetry tests based on robust multivariate estimators.

: #### Probabilistic Bounds on Complexity of Networks Classifying Large Data Sets

18.11.2019The number of binary classification tasks on a finite domain (representing a set of vectors of features, measurements, or observations) grows exponentially with its size. However for a given application area, relevance of many such tasks might be low or negligible. The lecture will investigate a probabilistic framework modeling a prior knowledge about probabilities that a presence of some features implies a property described by one of the classes. Impact of increasing sizes of domains on correlations between input-output mappings of computational models and randomly-chosen classifiers will be analyzed. It will be shown that on large domains, correlations between tasks and computational units are sharply concentrated around their mean values. Probabilistic bounds on correlations will be derived using various concentration inequalities, some of them holding also without the “naive Bayes assumption”. It will be shown that performance of random classifiers is almost deterministic, in the sense that either a given class of computational models can approximate well almost all tasks or none of them. Consequences for the choice of optimal computational models will be discussed.

: #### Interpolant generation algorithms based on quantifier elimination

01.07.2019In their paper in 2006, Kapur, Majumdar and Zarba observed a connection between quantifier elimination and interpolant generation which was probably well-known but not explicitly reported in the literature on automated reasoning and formal methods. Since then I have been investigating how to develop heuristics for quantifier elimination to generate interpolants. Particularly, there is no need to have access to a proof to generate interpolants, a methodology widely used in the formal methods community. I will start with an interpolant generation algorithm in the quantifier-free theory of equality over uninterpreted symbols (EUF). This is followed by an interpolant generation algorithm for octagonal formulas, which is of complexity O(n^3), where n is the number of variables; an interpolant generated is a conjunction of octagonal formulas. Then, I will move to nonlinear (quadratic) polynomial inequalities and their combination with EUF. The key idea is that nonlinear concave quadratic polynomials can be linearized and a generalization of Motzkin's transposition theorem can be proved. Time permitting, I will discuss our recent work on using linear and nonlinear deep learning methods based on support vector machines, exploring a totally different approach for generating nonlinear interpolants from more complex formulas including transendental functions.

: #### Automata and Turing Computation with Synfire Ring-Based Neural Networks

04.03.2019Synfire rings are layered looping neuronal structures that allow for robust, temporally precise and self-sustained spiking activities. We show that finite state automata and Turing machines can be simulated by recurrent neural networks composed of synfire rings. While the original constructions involve simple Boolean neurons, we investigate the generalization of the results to the more biological contexts of Izhikevich and Hodgkin-Huxley neural nets. These considerations show that a neuro-inspired paradigm for abstract computation is possible and potentially exploitable. They might constitute a very first step towards the realization of neuronal computers.

: #### Deep Neural Networks in Natural Language Processing

14.01.2019Similarly to other fields, Natural language processing has recently been revolutionized by the introduction of Deep learning. In my talk, I will focus on two specific properties of natural text as input to a machine learning system, and the current ways in which these are addressed in Deep neural networks:

- Representing massively multi-valued discrete data (words) by continuous low-dimensional vectors (word embeddings).
- Handling variable-length input sequences with long-distance relations between elements (sentences) by fixed-sized neural units (attention mechanisms).

:

ORGANIZERS: F. Hakl, V. Kůrková, J. Wiedermann E-mail: horainf@cs.cas.cz Web page: https://www.cs.cas.cz/horainf