Research
In very broad strokes, my research interests include:
-
Ancient Near Eastern Languages and Texts: Ancient Near
Eastern languages and texts were the original impetus
for my journey into markup languages. It seemed odd to me at
the time (and still does) that the study of such languages and
texts is still largely confined to print media. A notable
exception to that observation is Steve Tinney's work at the
University of Pennsylvania, which has resulted in the entire
corpus of Sumerian being encoded for the Sumerian Dictionary
project. If similar work existed for Akkadian, Egyptian, Ugaritic and
other ANE languages, grammatical rules and usage studies could
be based on all the evidence, as opposed to what a single
scholar can read and retain over the course of an academic
career.
-
Bible encoding, analysis and delivery: Computers have been
employed for analysis of the Bible since the 1960s, but that
work has too often resulted in proprietary data sets that are
not available to all scholars. The unfortunate result has
been that the Hebrew Bible, for instance, has been entered,
proofed, re-entered and re-proofed more than a few times,
with each project starting from ground zero. At best, such
practices hardly represent an accumulation of scholarship; at
worst, they are simply duplication of rote effort under the
guise of scholarship.
A common text upon which scholars could add their analysis,
as opposed to simply duplicating the work of others, would be
a starting point for cumulative scholarship. Beyond that,
scholars need tools that allow them to apply their scholarly
knowledge and methods without getting in the way of that
task. Since most biblical scholars already work in several
languages, it seems churlish to insist that they learn markup
syntax or arcane computer languages simply to go about their
common tasks.
-
Collaborative scholarship: An area where biblical
and ANE scholars have lagged behind their counterparts in
the natural sciences is in collaborative research. It is true
that small groups of scholars may work together on projects,
but routine multi-institution approaches taken as a matter of
course in other disciplines are sorely missing.
It is also unfortunate that biblical scholars have made few
efforts to enlist the aid of those who are not employed as
biblical scholars. Colleges and seminaries graduate far more
people trained in Hebrew and Greek than there are positions
for employment, resulting in a vast pool of talent that is
being ignored by the traditional biblical studies
community.
The WWW has the potential to tap into the talent pool of
non-traditional biblical scholars, whose only lack is
professional employment as biblical scholars. There are any
number of collaboration environments that can be adapted to
utilize that pool of talent.
-
Digitization (both imaging and encoding) of
primary/secondary materials: Part of my interest in
digitization (both senses) stems from my early interest in ANE
studies. I was located over 200 miles from the nearest
research library and specialized materials could be obtained
only with difficulty. That certainly remains the case for
scholars at second-tier (an arrogant designation that is
unconnected to quality of teaching or scholarship) and
lower institutions in the United States, and even more so
in developing countries, to say nothing of talented
individuals who are not employed as scholars.
The technologies exist now or are easily adaptable to
make access to primary and secondary materials a matter of
choice and not physical location. Granted, substantial
resources would be required to make everything available, but
the natural sciences have done quite well with, at least
initially, a minimum of resources.
-
Fonts for biblical and Ancient Near Eastern studies:
Rendering of encoded resources for biblical and Ancient Near
Eastern studies has long been problematic. Even with the
advent of Unicode, assuming successful proposals for Akkadian
and Egyptian, there remains the problem of encoding and
displaying the texts as written.
That is in part due to the
absence of the character/glyph distinction that works
relatively well in post-Gutenberg typography, but not in all
cases. The longer before Gutenberg a text was composed,
the more likely the distinction is to be
pernicious. Still, some form of universal interchange is
necessary and the weight of full representation of ANE
languages will fall back onto markup systems.
Very complex representations, such as Egyptian tomb
inscriptions, will no doubt require a combination of Unicode,
markup and interchangeable representation formats such as
Scalable Vector Graphics (SVG) for rendering of texts both in
their typical textbook rendition and as written.
-
Markup Languages: While I originally became
interested in markup languages while pursuing studies in ANE
languages, I have come to appreciate them as languages in
their own right. It is interesting to note that, at a certain
level of abstraction, the lessons from linguistics,
formal languages (and automata), markup parsers and similar
disciplines all begin to coalesce, despite surface
differences.
-
Overlapping markup: For the past decade or so, my concern has
been the problem of texts that do not follow the rather
simplistic content models offered by most markup languages, or
supported by the parsers for those languages that do offer
more complex content models. Any number of solutions have been
offered, several by Matthew O'Donnell and myself,
but none has quite amounted to an elegant and widely
accepted solution to the problem.
This is one of the major unsolved problems for the use of
markup in academic work, since representing a text as it
is seen by the researcher is obviously of more benefit than
flattening it to conform to the arbitrary limitations of "good
enough" markup languages. (There are a number of
solutions to the overlapping markup problem, but suffice
it to say that none has attracted a significant following. All
involve trade-offs of one sort or another.)
If a commercial motivation is necessary, it should be noted that
"solving" the problem of overlapping markup will have an
enormous impact on legal, governmental, publishing and other
areas. Any activity whose texts change will benefit from a
solution to this problem in terms of access, cost of
production and management of changes.
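To make the problem concrete, here is a minimal sketch of one well-known family of approaches, standoff annotation, in which each layer of analysis points into the text by character offsets instead of nesting tags inside it. It is an illustration only, not one of the solutions mentioned above. The sample text and the names Span, span_of, overlaps and find_overlaps are hypothetical. The trade-off is typical: overlap becomes trivial to record, but the offsets must be recomputed whenever the base text changes.

```python
from __future__ import annotations

from dataclasses import dataclass
from itertools import combinations

# Hypothetical sample text: the second sentence runs across the line break,
# so the "line" layer and the "sentence" layer cannot nest in one hierarchy.
LINE_1 = "The king built a wall. He then"
LINE_2 = "raised a temple."
TEXT = LINE_1 + "\n" + LINE_2


@dataclass(frozen=True)
class Span:
    layer: str  # name of the analysis layer, e.g. "line" or "sentence"
    start: int  # offset of the first character (inclusive)
    end: int    # offset just past the last character (exclusive)


def span_of(layer: str, fragment: str) -> Span:
    """Build a span by locating the fragment in TEXT (first occurrence)."""
    start = TEXT.index(fragment)
    return Span(layer, start, start + len(fragment))


def overlaps(a: Span, b: Span) -> bool:
    """True when two spans intersect but neither contains the other."""
    intersect = a.start < b.end and b.start < a.end
    a_in_b = b.start <= a.start and a.end <= b.end
    b_in_a = a.start <= b.start and b.end <= a.end
    return intersect and not (a_in_b or b_in_a)


def find_overlaps(spans: list[Span]) -> list[tuple[Span, Span]]:
    """Return every pair of spans that a single tree could not represent."""
    return [(a, b) for a, b in combinations(spans, 2) if overlaps(a, b)]


spans = [
    span_of("line", LINE_1),
    span_of("line", LINE_2),
    span_of("sentence", "The king built a wall."),
    span_of("sentence", "He then\nraised a temple."),
]

for a, b in find_overlaps(spans):
    print(a.layer, repr(TEXT[a.start:a.end]), "overlaps",
          b.layer, repr(TEXT[b.start:b.end]))
```

Here the first line and the second sentence are reported as overlapping: each contains material the other lacks, which is exactly the case that a single element hierarchy has to flatten.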
-
Topic Maps: I became involved in topic maps during the
formation of TopicMaps.Org, an organization that produced an
XML version of ISO 13250 (XTM). While XTM is certainly one way
to produce topic maps for the WWW, it does not
represent the warp and woof of the topic maps paradigm. It
does have the conveniences of a semantically interoperable
syntax and a simple notion of subject identity, which may be
sufficient for many tasks, but not all.
At the core of the topic maps paradigm is a notion of semantic
integration. That is to say that if you tell me the basis upon
which you have identified subjects and I know the same
information about my subjects, I can meaningfully integrate
information about subjects that you and I have identified
independently of each other. And, if I have taken the trouble to
perform that task with my subjects, you can, without
consulting me, achieve the same end. The ultimate result is
that either of us can have all the information about a
particular subject, however either of us identified it, in a
single location.
It is important to note that semantic interoperability, that is, the
seamless interchange of information, relies upon having a
common language, like XML or RDF, for the exchange of
information. If we all mean the same thing by "car", such
a system works fairly well. Where such systems start to fall
apart is when the terms are not easy to define or agree upon,
such as "democracy" or "freedom", or even
"marriage". (It has been reported to me that EU
automobile manufacturers cannot even agree on general classes
of parts, something that is rather critical for inventory,
ordering and other information systems.)
Enabling semantic integration, the goal of topic maps, is much more
modest. All topic maps request is that the user of a term say
how they identify the subject that is represented by that
term. That does not guarantee that another user will find it
possible to integrate a particular term with the one they use,
but it does enable such a process to take place. Without that
information, semantic integration is by definition
impossible.
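The integration step described above can be sketched in a few lines. This is a minimal illustration, assuming that each author states subject identity as a set of identifier strings and that two topics stand for the same subject when those sets share at least one member. It is not the XTM or ISO 13250 processing model. The Topic class, the merge_topic_maps function and the example.org identifiers are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Topic:
    """A subject as one author sees it: how it is identified, and what is said."""
    identifiers: set[str]
    facts: set[str] = field(default_factory=set)


def merge_topic_maps(*topic_maps: list[Topic]) -> list[Topic]:
    """Merge topics from any number of maps that share a subject identifier."""
    merged: list[Topic] = []
    for topic in (t for topic_map in topic_maps for t in topic_map):
        # Work on a copy so the input maps are left untouched.
        current = Topic(set(topic.identifiers), set(topic.facts))
        remaining: list[Topic] = []
        for existing in merged:
            if existing.identifiers & current.identifiers:
                # Same subject: absorb its identifiers and facts.
                current.identifiers |= existing.identifiers
                current.facts |= existing.facts
            else:
                remaining.append(existing)
        remaining.append(current)
        merged = remaining
    return merged


# Two maps written independently (hypothetical identifiers and statements).
mine = [Topic({"http://example.org/id/gilgamesh"}, {"hero of an epic"})]
yours = [
    Topic({"http://example.org/id/gilgamesh", "http://example.org/id/bilgames"},
          {"king of Uruk"}),
    Topic({"http://example.org/id/enkidu"}, {"companion of Gilgamesh"}),
]

for topic in merge_topic_maps(mine, yours):
    print(sorted(topic.identifiers), sorted(topic.facts))
```

Neither author consulted the other, yet everything said about the shared subject ends up in a single topic because both disclosed how they identified it. A topic with no disclosed identifier could never be merged, which is the point of the last sentence above.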