3rd IASC world conference on
Computational Statistics & Data Analysis
Amathus Beach Hotel, Limassol, Cyprus, 28-31 October, 2005
 
Title:  Analysis of Symbolic and Structured Data

Description:

With the advent of the “information age”, we have witnessed a dramatic growth of applications in government, business and education, many of which produce data organised in a variety of structures and formats. As a consequence, there is an increasing need to extend standard exploratory, statistical and graphical data analysis methods to more complex data that go beyond the classical framework, which is characterised by a relatively simple representation of data, such as a database relation or a standard data table.

This is the case for data concerning more or less homogeneous classes or groups of individuals (second-order objects or macro-data), instead of single individuals (first-order objects or micro-data). The extension of classical data analysis techniques to the analysis of second-order objects is one of the main goals of a novel research field named Symbolic Data Analysis. Symbolic data extend the classical tabular model by allowing multiple, possibly weighted, values for each descriptive attribute, which makes it possible to represent the variability and/or uncertainty present in the data. Symbolic Data Analysis methods for analysing symbolic data tables include univariate descriptive methods, clustering, decision trees, discrimination, regression and factorial analysis.
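
As an illustration only, the passage from micro-data to second-order objects can be sketched in a few lines of Python. The records, the attribute names and the `to_symbolic` helper below are hypothetical and not taken from any Symbolic Data Analysis software; the sketch assumes that a numeric attribute is summarised by an interval and a categorical attribute by weighted categories.

```python
from collections import Counter, defaultdict

# Hypothetical micro-data: one record per individual (first-order objects).
micro_data = [
    {"region": "North", "age": 34, "sector": "services"},
    {"region": "North", "age": 51, "sector": "industry"},
    {"region": "North", "age": 27, "sector": "services"},
    {"region": "South", "age": 45, "sector": "agriculture"},
    {"region": "South", "age": 62, "sector": "agriculture"},
]


def to_symbolic(records, group_key, numeric_attr, categorical_attr):
    """Aggregate individuals into one symbolic description per group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record)

    symbolic = {}
    for name, members in groups.items():
        values = [m[numeric_attr] for m in members]
        counts = Counter(m[categorical_attr] for m in members)
        symbolic[name] = {
            # Interval-valued attribute: (minimum, maximum) over the group.
            numeric_attr: (min(values), max(values)),
            # Weighted categories: relative frequency within the group.
            categorical_attr: {c: n / len(members) for c, n in counts.items()},
        }
    return symbolic


# Each region is now a second-order object described by symbolic values.
symbolic_table = to_symbolic(micro_data, "region", "age", "sector")
```

Each row of `symbolic_table` describes a group rather than an individual, so the variability inside the group is kept instead of being collapsed into a single value such as a mean.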

A particular type of structured data is represented by taxonomic attributes, that is, attributes whose categories are ordered in a rooted hierarchical tree, called a taxonomy. In addition, dependencies may exist between variables. These dependencies may be logical (e.g. if colour is blue then type is river), causal (e.g. if driving speed is high, then the number of accidents is high with probability 0.8) or hierarchical, expressing that the applicability of one variable depends on the values taken by another one (e.g. if gender is male then the number of pregnancies is non-applicable). A more complex representation is given by first-order logic, where both attributes of single individuals and relations between individuals are represented.
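
The following Python sketch is a hedged illustration of two of these notions: a taxonomy stored as a rooted tree, and the hierarchical dependency between gender and number of pregnancies quoted above. The category names, the `PARENT` mapping, the `NOT_APPLICABLE` marker and the function names are all assumptions made for the example.

```python
# Hypothetical taxonomy stored as child -> parent; the root has parent None.
PARENT = {
    "water": None,
    "river": "water",
    "lake": "water",
    "stream": "river",
}


def ancestors(category):
    """Return the categories on the path from `category` up to the root."""
    path = []
    node = PARENT[category]
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def generalises(general, specific):
    """True if `general` equals `specific` or lies above it in the taxonomy."""
    return general == specific or general in ancestors(specific)


# Assumed marker for a variable that does not apply to an individual.
NOT_APPLICABLE = "N/A"


def check_hierarchical_rule(record):
    """Check: if gender is male, the number of pregnancies is non-applicable."""
    if record["gender"] == "male":
        return record["pregnancies"] == NOT_APPLICABLE
    return True
```

With these definitions, `generalises("water", "stream")` holds because `water` is an ancestor of `stream`, and a record such as `{"gender": "male", "pregnancies": 2}` violates the hierarchical rule.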

Structured data arise in many different domains, such as official statistics (for the handling of census data and of survey data, where questions often depend on each other), data warehouses, GIS applications, XML documents and genomic databases.

This track is meant to present contributions and stimulate discussion on the statistics and analysis of symbolic and structured data.

Co-Chairs:

Paula Brito
Faculdade de Economia
University of Porto
Rua Dr. Roberto Frias
4200-424 Porto
Portugal
Fax : (+351) 225505050
e-mail : mpbrito@fep.up.pt

Lynne Billard
University of Georgia at Athens
Statistics
102 Statistics Building
Athens, Georgia, USA
e-mail: lynne@stat.uga.edu

Edwin Diday
LISE-Ceremade
Université Paris-IX Dauphine
Pl. du M.al de Lattre de Tassigny
75016 Paris, France
e-mail : diday@ceremade.dauphine.fr

Georges Hébrail
Ecole Nationale Supérieure des Télécommunications
Département Informatique et Réseaux
46, rue Barrault
75634 Paris Cedex 13, France
e-mail : hebrail@enst.fr

Donato Malerba
Dipartimento di Informatica
Università degli Studi di Bari
Via Orabona 4
70126 Bari, Italy
e-mail: malerba@di.uniba.it
 