In March 2011 I defended my PhD thesis entitled
"Model-based spreadsheet engineering". Fortunately, I did it with
success! I was supervised by João Saraiva and
Joost Visser.
Currently, I am a post-doctoral researcher at
University of Minho and
Oregon State University working
with João Saraiva and
Martin Erwig.
I am also a member of the
SSaaPP project.
My research is mainly focused on applying techniques from modern programming
languages to spreadsheets. I also use MDE techniques to improve spreadsheets.
I am starting to do some work with bidirectional transformations.
Before, I graduated on
Mathematics and Computer Sciences.
I was in the organizing committee of GTTSE '11,
as responsible for local arrangements and registration.
I am the co-organizer of the HASLab Seminar Series
in collaboration with José Creissac Campos.
Please, contact us if you would like
to participate in the seminars or if you would like to give a talk.
(top)
In this paper, we present MDSheet, a framework for the embedding, evolution and inference of spreadsheet models. This framework offers a model-driven software development mechanism for spreadsheet users.
@inproceedings{icse12,
Author = {Jácome Cunha and João Paulo Fernandes and Jorge Mendes and João Saraiva},
Booktitle = {Proceedings of the 34rd International Conference on Software Engineering},
Note = {to appear.},
Publisher = {ACM},
Series = {ICSE'12},
Title = {{MDSheet: A Framework for Model-driven Spreadsheet Engineering}},
Year = {2012}}
Spreadsheets are among the most popular programming languages in the world. Unfortunately, spreadsheet systems were not tailored from scratch with modern programming language features that guarantee, as much as possible, program correctness. As a consequence, spreadsheets are populated with unacceptable amounts of errors.
In other programming language settings, model-based approaches have been proposed to increase productivity and program effectiveness. Within spreadsheets, this approach has also been followed, namely by ClassSheets. In this paper, we propose an extension to ClassSheets to allow the specification of spreadsheets that can be viewed as relational databases. Moreover, we present a transformation from ClassSheet models to UML class diagrams enriched with OCL constraints. This brings to the spreadsheet realm the entire paraphernalia of model validation techniques that are available for UML.
@inproceedings{sac-se12,
Author = {Cunha, J\'{a}come and Fernandes, Jo{\~a}o Paulo and Saraiva, Jo\~{a}o},
Booktitle = {In the Software Engineering Track at the 27th Annual ACM Symposium On Applied Computing (SAC 2012)},
Note = {to appear},
Publisher = {ACM},
Title = {From Relational ClassSheets to UML+OCL},
Year = {2012}}
This paper describes the embedding of ClassSheet models in spreadsheet systems. ClassSheet models are well- known and describe the business logic of spreadsheet data. We embed this domain specific model representation on the (general purpose) spreadsheet system it models. By defining such an embedding, we provide end users a model-driven engineering spreadsheet developing environment. End users can interact with both the model and the spreadsheet data in the same environment. Moreover, we use advanced techniques to evolve spreadsheets and models and to have them synchronized. In this paper we present our work on extending a widely used spread- sheet system with such a model-driven spreadsheet engineering environment.
@inproceedings{Cunha:2011,
Address = {Washington, DC, USA},
Author = {Cunha, J\'{a}come and Mendes, Jorge and Fernandes, Jo\~{a}o Paulo and Saraiva, Jo\~{a}o},
Booktitle = {Proceedings of the 2011 IEEE Symposium on Visual Languages and Human-Centric Computing},
Keywords = {Spreadsheets, Embedding ClassSheets},
Numpages = {8},
Pages = {186--201},
Publisher = {IEEE Computer Society},
Series = {VLHCC '11},
Title = {Embedding and Evolution of Spreadsheet Models in Spreadsheet Systems},
Year = {2011}}
Abstract: Spreadsheets are widely used, and studies have shown that most end-user spreadsheets contain non-trivial errors. To improve end-users productivity, recent research proposes the use of a model-driven engineering approach to spreadsheets.
In this paper we conduct the first systematic empirical study to assess the effectiveness and efficiency of this approach. A set of spreadsheet end users worked with two different model-based spreadsheets, and we present and analyze here the results achieved.
@inproceedings{eusprig11,
Author = {Laura Beckwith and J{\'a}come Cunha and Jo{\~a}o Paulo Fernandes and Jo{\~a}o Saraiva},
Booktitle = {Proceedings of the European Spreadsheet Risks Interest Group},
Editor = {Simon Thorne and Grenville Croll},
Isbn = {978-0-9566256-9-4},
Month = {July},
Pages = {87--100},
Series = {EuSpRIG '11},
Title = {An Empirical Study on End-users Productivity Using Model-based Spreadsheets},
Year = {2011}}
Abstract: Spreadsheets are widely used and studies show that most of the existing ones contain non-trivial errors. To improve end-users productivity, recent research proposes the use of a model-driven engineering approach to spreadsheets. In this paper we conduct the first empirical study to assess the effectiveness and efficiency of this approach. A set of spreadsheet end users worked with two different model-based spreadsheets. We present and analyze here the results achieved.
@inproceedings{isedu11,
Author = {Laura Beckwith and J\'{a}come Cunha and Jo\~{a}o Paulo Fernandes and Joo Saraiva},
Booktitle = {Proceedings of the Third International Symposium on End-User Development},
Pages = {282--288},
Series = {IS-EUD '11},
Title = {End-users Productivity in Model-based Spreadsheets: {A}n Empirical Study},
Year = {2011}}
Type-safe Evolution of Spreadsheets
(PDF)
Jácome Cunha, Joost Visser,Tiago Alves, João Saraiva,
In FASE '11: Proceedings of the International Conference on Fundamental Approaches to Software Engineering: Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2011, Berlin, Heidelberg, 2011. Springer-Verlag. pages 186-201.
Abstract: Spreadsheets are notoriously error-prone. To help avoid the introduction of
errors when changing spreadsheets, models that capture the structure and interdependencies of
spreadsheets at a conceptual level have been proposed. Thus, spreadsheet evolution can be made
safe within the confines of a model.
As in any other model/instance setting, evolution may not
only require changes at the instance level but also at the model level. When model changes are
required, the safety of instance evolution can not be guarded by the model alone.
We have designed an appropriate representation of spreadsheet models, including the fundamental
notions of formula and references. For these models and their instances, we have designed coupled
transformation rules that cover specific spreadsheet evolution steps, such as the insertion of
columns in all occurrences of a repeated block of cells. Each model-level transformation rule is
coupled with instance level migration rules from the source to the target model and vice versa.
These coupled rules can be composed to create compound transformations at the model level
inducing compound transformations at the instance level. This approach guarantees safe evolution
of spreadsheets even when models change.
@inproceedings{Cunha:2011:TES:1987434.1987453,
author = {Cunha, J\'{a}come and Visser, Joost and Alves, Tiago and Saraiva, Jo\~{a}o},
title = {Type-safe evolution of spreadsheets},
booktitle = {Proceedings of the 14th international conference on Fundamental approaches to software engineering: part of the joint European conferences on theory and practice of software},
series = {FASE'11/ETAPS'11},
year = {2011},
isbn = {978-3-642-19810-6},
location = {Saarbrücken, Germany},
pages = {186--201},
numpages = {16},
url = {http://dl.acm.org/citation.cfm?id=1987434.1987453},
acmid = {1987453},
publisher = {Springer-Verlag},
address = {Berlin, Heidelberg},
}
Abstract: Spreadsheets are widely used by end users, and studies have shown that most end-user spreadsheets contain non-trivial errors. To improve end users productivity, recent research proposes the use of a model-driven engineering approach to spreadsheets. In this paper we conduct the first systematic empirical study to assess the effectiveness and efficiency of this approach. A set of spreadsheet end users worked with two different model-based spreadsheets, and we present and analyze the results achieved.
Abstract: Spreadsheets are notoriously error-prone. To help avoid the introduction of errors when changing spreadsheets, models that capture the structure and interdependencies of spreadsheets at a conceptual level have been proposed. Thus, spreadsheet evolution can be made safe within the confines of a model.
As in any other model/instance setting, evolution may not only require changes at the instance level but also at the model level. When model changes are required, the safety of instance evolution can not be guarded by the model alone.
Coupled transformation of models and instances are supported by the 2LT platform and have been applied for transformation of algebraic datatypes, XML schemas, and relational database models.
We have extended 2LT to spreadsheet evolution. We have designed an appropriate representation of spreadsheet models, including the fundamental notions of formula, references, and blocks of cells. For these models and their instances, we have designed coupled transformation rules that cover specific spreadsheet evolution steps, such as extraction of a block of cells into a separate sheet or insertion of columns in all occurrences of a repeated block of cells. Each model- level transformation rule is coupled with instance level migration rules from the source to the target model and vice versa.
These coupled rules can be composed to create compound transformations at the model level that induce compound transformations at the instance level. With this approach, spreadsheet evolution can be made safe, even when model changes are involved.
Abstract: Many errors in spreadsheet formulas can be avoided if spreadsheets are built
automatically from higher-level models that can encode and enforce consistency
constraints. However, designing such models is time consuming and requires
expertise beyond the knowledge to work with spreadsheets. Legacy spreadsheets
pose a particular challenge to the approach of controlling spreadsheet
evolution through higher-level models, because the need for a model might be
overshadowed by two problems: (A) The benefit of creating a spreadsheet is
lacking since the legacy spreadsheet already exists, and (B) existing data
must be transferred into the new model-generated spreadsheet.
To address these problems and to support the model-driven spreadsheet
engineering approach, we have developed a tool that can automatically infer
ClassSheet models from spreadsheets. To this end, we have adapted a method to
infer entity/relationship models from relational database to the
spreadsheets/ClassSheets realm. We have implemented our techniques in the
HaExcel framework and integrated it with the ViTSL/Gencel spreadsheet
generator, which allows the automatic generation of refactored spreadsheets
from the inferred ClassSheet model. The resulting spreadsheet guides further
changes and provably safeguards the spreadsheet against a large class of
formula errors. The developed tool is a significant contribution to
spreadsheet (reverse) engineering, because it fills an important gap and
allows a promising design method (ClassSheets) to be applied to a huge
collection of legacy spreadsheets with minimal effort.
@inproceedings{Cunha:2010:AIC:1915084.1916358,
author = {Cunha, J\'{a}come and Erwig, Martin and Saraiva, Jo\~{a}o},
title = {Automatically Inferring ClassSheet Models from Spreadsheets},
booktitle = {Proceedings of the 2010 IEEE Symposium on Visual Languages and Human-Centric Computing},
series = {VLHCC '10},
year = {2010},
isbn = {978-0-7695-4206-5},
pages = {93--100},
numpages = {8},
url = {http://dx.doi.org/10.1109/VLHCC.2010.22},
doi = {http://dx.doi.org/10.1109/VLHCC.2010.22},
acmid = {1916358},
publisher = {IEEE Computer Society},
address = {Washington, DC, USA},
keywords = {Spreadsheets, ClassSheets inference},
}
Abstract: Spreadsheets can be viewed as a highly flexible end-
users programming environment which enjoys wide-spread
adoption. But spreadsheets lack many of the structured programming
concepts of regular programming paradigms. In
particular, the lack of data structures in spreadsheets may
lead spreadsheet users to cause redundancy, loss, or corruption
of data during edit actions.
In this paper, we demonstrate how implicit structural
properties of spreadsheet data can be exploited to offer edit
assistance to spreadsheet users. Our approach is based
on the discovery of functional dependencies among data
items which allow automatic reconstruction of a relational
database schema. From this schema, new formulas and visual
objects are embedded into the spreadsheet to offer features
for auto-completion, guarded deletion, and controlled
insertion. Schema discovery and spreadsheet enhancement
are carried out automatically in the background and do not
disturb normal user experience.
@inproceedings{Cunha:2009:DEA:1685992.1686072,
author = {Cunha, J\'{a}come and Saraiva, Jo\~{a}o and Visser, Joost},
title = {Discovery-based edit assistance for spreadsheets},
booktitle = {Proceedings of the 2009 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC)},
series = {VLHCC '09},
year = {2009},
isbn = {978-1-4244-4876-0},
pages = {233--237},
numpages = {5},
url = {http://dx.doi.org/10.1109/VLHCC.2009.5295255},
doi = {http://dx.doi.org/10.1109/VLHCC.2009.5295255},
acmid = {1686072},
publisher = {IEEE Computer Society},
address = {Washington, DC, USA},
}
Jácome Cunha, João Saraiva, Joost Visser PEPM '09 Proceedings of the 2009 ACM SIGPLAN workshop on Partial evaluation and program manipulation, 2009
Abstract: This paper presents techniques and tools to transform spreadsheets
into relational databases and back. A set of data refinement rules is
introduced to map a tabular datatype into a relational database schema.
Having expressed the transformation of the two data models as data refinements,
we obtain for free the functions that migrate the data. We
use well-known relational database techniques to optimize and query the
data. Because data refinements define bidirectional transformations we
can map such database back to an optimized spreadsheet.
We have implemented the data refinement rules and we constructed
Haskell-based tools to manipulate, optimize and refactor Excel-like
spreadsheets.
@inproceedings{Cunha:2009:SRD:1480945.1480972,
author = {Cunha, J\'{a}come and Saraiva, Jo\~{a}o and Visser, Joost},
title = {From spreadsheets to relational databases and back},
booktitle = {Proceedings of the 2009 ACM SIGPLAN workshop on Partial evaluation and program manipulation},
series = {PEPM '09},
year = {2009},
isbn = {978-1-60558-327-3},
location = {Savannah, GA, USA},
pages = {179--188},
numpages = {10},
url = {http://doi.acm.org/10.1145/1480945.1480972},
doi = {http://doi.acm.org/10.1145/1480945.1480972},
acmid = {1480972},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {bi-directional transformations, functional programming, spreadsheets, type-safe data migration},
}
Luís Barbosa, Jácome Cunha, Joost Visser SYANCO '07 International workshop on Synthesis and analysis of component connectors: in conjunction with the 6th ESEC/FSE joint meeting, 2007
Abstract: Algebraic theories for modelling components and their interactions offer
abstraction over the specifics of component states and interfaces. For example,
such theories deal with forms of sequential composition of two components in a
manner independent of the type of data stored in the states of the components,
and independent of the number and types of methods offered by the interfaces of
the combinators. General purpose programming languages do not offer this level
of abstraction, which implies that a gap must be bridged when turning component
models into implementations. In this paper, we present an approach to prototyping
of component-based systems that employs so-called type-level programming
(or compile-time computation) to bridge the gap between abstract component
models and their type-safe implementation in a functional programming language.
We demonstrate our approach using Barbosa's model of components as generalised
Mealy machines. For this model, we develop a combinator library in Haskell, which
uses type-level programming with two effects. Firstly, wiring between components
is computed during compilation. Secondly, the well-formedness of the component
compositions is guarded by Haskell's strong type system.
@inproceedings{Barbosa:2007:TAC:1294917.1294920,
author = {Barbosa, Lu\'{\i}s and Cunha, J\'{a}come and Visser, Joost},
title = {A type-level approach to component prototyping},
booktitle = {International workshop on Synthesis and analysis of component connectors: in conjunction with the 6th ESEC/FSE joint meeting},
series = {SYANCO '07},
year = {2007},
isbn = {978-1-59593-720-9},
location = {Dubrovnik, Croatia},
pages = {23--36},
numpages = {14},
url = {http://doi.acm.org/10.1145/1294917.1294920},
doi = {http://doi.acm.org/10.1145/1294917.1294920},
acmid = {1294920},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {Haskell, coalgebra, combinator library, mealy machine, type-level programming},
}
HaExcel is a framework to manipulate, transform and query spreadsheets.
It is implemented in Haskell
and consists of the following parts:
Library A generic/reusable library to map spreadsheets into relational database
models and back: This library contains an algebraic data type to model a
(generic) spreadsheet and functions to transform it into a relational model and
vice versa. Such functions are implemented as refinement rules.
The library includes two code generator functions: one that produces
the SQL code to create and populate the database, and a function that
generates Excel/Gnumeric code to map the database back into a spreadsheet. A
MySQL database can also be created and manipulated using this library under
HaskellDB
Front-ends A front-end to read spreadsheets in the
Excel and Gnumeric formats: The front-end reads spreadsheets
in the portable XML documents using
the UMinho Haskell Libraries. We reuse the spatial logic algorithms from the
UCheck
project to discover the tables stored in the spreadsheet. The first
row of each table is used as labels and the remaining elements are assumed to
be data.
Tools Two spreadsheet tools: A batch and a
online tool t
hat allow the users to
read, transform, refactor and query spreadsheets.
To install it, just double click it and OpenOffice
will open a dialog to help you installing it.
Note that you need to have OpenOffice installed.
This is a platform independent addon.
The back end is platform dependent. Its sources can be found
here.
Note that this is a Subversion repository.
To download it you can type "svn co http://haskell.di.uminho.pt/websvn/HaExcel/ FOLDER_TO_STORE_IT".
To construct it just type "make ghc". It has been tested in GHC 6.8.* and in GHC 6.10.*.
A version to Mac OS X (Leopard) can be found here.
After compiling it, you need to put it in a place that the addon knows about.
In this case this place is stored in the OpenOffice path variable "My Documents".
To see its value go to the OpenOffice preferences, then to the OpenOffice.org
tab, and then click in the Paths tab. There you can find the variable's value.
To know more about it, please see the publications bellow.