Graphs and Paths: Professor Peter Wood’s inaugural lecture

This post was contributed by Riccardo Frosini, PhD student in Birkbeck’s Department of Computer Science and Information Systems. On 6 June, Riccardo attended the inaugural lecture of Professor Peter Wood. Here he reports on the lecture.

Cloud computing

The Department of Computer Science and Information Systems has announced the promotion to professor of Dr Peter T. Wood. The inaugural lecture to celebrate the event was attended by many colleagues, friends and students.

Professor Wood received his BSc and MSc in Computer Science from the University of Cape Town. In 1989 he obtained his PhD at the University of Toronto. He started working at Birkbeck in 2001, where he was Head of School from September 2006 until July 2009.

During the lecture, the audience could recognise his commitment and passion to his field of study as well as his humble personality. His lecture was focused on graph theory, in particular the recurring topic that has kept him busy over the years: Queries on Graphs.

Querying graphs

Professor Wood introduced the lecture by showing the importance of graph theory in history: from the Bridges of Koenigsberg problem to the more recent Linked Open Data project.

A graph is a collection of nodes or entities which are connected by edges. Entities can be an abstract representation of real world data and the edges represent the connections between them. Graphs are also used in social networks, geographical information, computer program analysis, and general knowledge representation.

One example shown during the lecture was an airline graph, where the entities represent airports and the edges represent flight connections. In this example, the edges can be labelled with different values, such as the number of miles between the airports or the airline companies operating the route.

During his PhD, Professor Wood investigated the computational complexity of querying such graphs.

One of his contributions at that time was to propose the use of regular expressions for finding paths in graphs. He proved that this problem is hard to solve for computers when nodes are not allowed to repeat on paths, but also that the problem becomes tractable when the regular expressions are restricted in certain ways

Querying trees

Professor Peter Wood

Professor Peter Wood

The second topic of research covered by Professor Wood during the lecture was optimising queries on trees.

Trees are particular kinds of graphs. A simple example is the family tree where each node represents a person and each connection is a parent-child relationship. One common language for representing trees is XML (eXtensible Modelling Language) which is often used for data exchange on the internet.

To define constraints on the allowed structure of XML, a schema or DTD (Document Type Definition) often is used, both of which also use regular expressions.

One common query language for XML is called XPath. If an XPath query does not conform to the DTD constraints, then it will not return any result when posed against an XML file. These kinds of queries are called “unsatisfiable”.

Professor Wood has proved that, in general, is computationally hard to check if an XPath query is satisfiable. During his research, he discovered a particular property for DTD’s called “covering”. This property was not only very common in real-world DTDs, but also made checking XPath satisfiability tractable for certain fragments of the XPath language.

Flexible Querying

Professor Wood’s most recent work, undertaken jointly with a number of colleagues and PhD students, is in flexible querying. They added flexible querying to the standard query language for Linked Open Data (a collection of graph databases), called SPARQL 1.1 (which also supports regular expressions).

The flexible querying technique in SPARQL helps users to find answers to queries on Linked Open Data datasets, when they are not familiar with the structure and/or vocabulary of the data.

To generate answers that may be useful to the user, the flexible querying approach poses many different queries over the data. In order to reduce the number of queries posed, Professor Wood investigated the possible impact of discarding queries that were not satisfiable. This was done by the querying system keeping track of the possible paths available in the graph being queried.

Experiments on real data showed that discarding the unsatisfiable queries detected using this approach can result in dramatic improvements in query execution time.

Professor Wood concluded his lecture by mentioning his ongoing and future work, including work on social network applications, generating mappings between distributed Linked Open Data graphs, and further optimisation techniques to speed up the evaluation of queries on graphs.

Click this image to watch Professor Peter Wood’s inaugural lecture on Panopto – Birkbeck’s video platform

Click this image to watch Professor Peter Wood’s inaugural lecture on Panopto – Birkbeck’s video platform

Find out more

Share

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.