Themes

In my experience of Digital Humanities, the recurring themes are of two kinds: the meta themes, which are about the field itself, and the themes about its inner workings.

In the first kind I see many similarities to the themes in psychology: what is this field of study? Is it a science or a humanities discipline? What is its purpose, and who does it? These types of meta questions are common to all newly defined disciplines. Establishing what the field does is important because it is taking new leaps now, as hardware keeps developing. Using computers to map out texts is the basis of teaching neural networks to create text themselves and to understand more complex ideas. Making precise historical maps and digitising information will contribute to that too. So, whatever the exact answers to the meta questions are, it is definitely a useful area of study and should be established as one.

The other type of recurring theme is impossible to avoid when speaking about digital humanities: data, metadata, mapping, crowd projects. In the archaeological project that I participated in with my professor and a group of archaeologists, all of these themes made an appearance. Gathering data since 2005, the archaeologists studying Saadiyat Island have discovered many hearths on the island, showing that life has existed here for millennia. They then put all of the information together to form a map of the place, which was enhanced by 3D mapping devices that could take data from the ground and create elaborate pictures that include height. All of the people who took part in the trip also took part in the crowdsourcing part of the project. We all contributed to making a complete map, with location coordinates, of all of the objects that were observable on the site. This included not only historical items, but also items that became history as we documented them.

Data

Data comes in many shapes and forms: objects, pictures, text, etc. But what really concerns humanists is data in the form of 1s and 0s, just like the data that interests programmers. Our data, however, begins its journey in a very different, physical format. All information created in the humanities can be turned into computational data. The works of a poet can be translated into computer language and analyzed for their contents, or used to teach a computer something. The paintings of an artist can be digitised so that a machine can count the strokes. A historian's works can be compared to those of people long gone. These are just a few examples of data in the humanities.
Data can be stored in many ways, or formats, depending on what it is needed for. For creating networks, for example, the extensions .csv or .xls are a good, easy way to keep data organised and easily accessible from other software (Google Sheets, Microsoft Excel, etc.). An extension that keeps all of the digital information in a picture is .raw; however, such files are quite big.
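As a rough illustration of why a .csv file suits network data, here is a minimal Python sketch. It assumes a hypothetical edge list in which each row records how many letters two people exchanged; the names, the column headers and the read_network helper are all invented for this example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical edge list: each row says that two people exchanged letters.
# In practice this text would live in a file such as letters.csv.
CSV_TEXT = """source,target,letters
Ada,Byron,3
Ada,Clara,1
Byron,Clara,2
"""

def read_network(csv_file):
    """Build an undirected adjacency map {person: {neighbour: weight}}."""
    network = defaultdict(dict)
    for row in csv.DictReader(csv_file):
        weight = int(row["letters"])
        # Store the edge in both directions because letters go both ways.
        network[row["source"]][row["target"]] = weight
        network[row["target"]][row["source"]] = weight
    return dict(network)

network = read_network(io.StringIO(CSV_TEXT))
for person, neighbours in sorted(network.items()):
    print(person, "->", sorted(neighbours))
```

The same rows can be opened unchanged in a spreadsheet program, which is what makes the format convenient for sharing between people and tools.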
On the topic of big things, data sets can be small, medium or big. Small data is easy to navigate and can even be worked through by hand. Medium data requires a lot of input and some software in order to be processed. Lastly, there is big data: it is gathered from many sources, and analyzing it, even with software, usually takes a substantial amount of time. It is really hard to work with because mistakes are hard to find and details might be left out, so the result depends heavily on the software that is working with it. A famous example of a failure with big data is Google's flu predictor, Google Flu Trends, which failed numerous times due to misinformation or because it didn't take into account the metadata of the information it had (e.g. the time of year, which correlates with the spread of the flu).

Metadata

This is the background information generated not directly by users but by devices. I believe the kind most relevant to data collection in the humanities is location. Every message, photo or check-in has coordinates attached to it, giving the geolocation that the phone estimates from GPS satellites, and these are rarely wrong by much. So, without the need to manually input locations or search for pins on maps, one can simply use these coordinates to map anything. In the humanities this can be used for projects that gather information about anything that is currently available to be photographed, just like we did in the archaeological project.
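A minimal sketch of how such location metadata could be put to work, assuming each photo has already been reduced to a record with a latitude and a longitude. The file names and coordinates below are invented and are not measurements from the project; the distance_m and nearby helpers are hypothetical names chosen for this example.

```python
import math

# Hypothetical photo records: the coordinates are invented for illustration
# and are not measurements from the actual project.
photos = [
    {"file": "find_001.jpg", "lat": 24.5400, "lon": 54.4300},
    {"file": "find_002.jpg", "lat": 24.5410, "lon": 54.4320},
    {"file": "find_003.jpg", "lat": 24.5600, "lon": 54.4500},
]

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres

def distance_m(a, b):
    """Great-circle (haversine) distance between two records, in metres."""
    lat1, lon1, lat2, lon2 = map(
        math.radians, (a["lat"], a["lon"], b["lat"], b["lon"])
    )
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))

def nearby(records, centre, radius_m):
    """Return the records lying within radius_m metres of centre."""
    return [r for r in records if distance_m(r, centre) <= radius_m]

# Which photos were taken within 500 metres of the first one?
close = nearby(photos, photos[0], radius_m=500)
print([r["file"] for r in close])
```

Because the coordinates travel with each file, grouping finds by area needs no manual pin-dropping at all.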

But metadata can be added manually as well, which makes it useful not only for location but also for sorting information by anything entered as metadata. Say, for a play one can ask many questions that cannot always be answered from the text alone, such as: whom does the line address, where in the story's setting was it said, where on stage was it said, etc. These types of details can also be contained in the metadata of a play, in order to help both literary scholars and actors work with the play. In fact, intonation, accent, facial expression and all kinds of other details can be added to the metadata bank of a file. Possibly, in future plays where the actor's interpretation of a character or situation is not considered important, the actor could even learn a play alone, using only information from the metadata. All directors beware!
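One way to picture manually added metadata is as extra fields stored next to each line of the play. The sketch below is only an assumption about how such a record might look: the Line class, its field names and the sample lines are all invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Line:
    speaker: str
    text: str
    addressee: str       # whom the line addresses
    stage_position: str  # where on stage it is said

# Invented lines, used only to show the structure.
play = [
    Line("Anna", "Who is there?", addressee="Boris",
         stage_position="downstage left"),
    Line("Boris", "Only me.", addressee="Anna",
         stage_position="centre"),
    Line("Anna", "He cannot be trusted.", addressee="audience",
         stage_position="downstage left"),
]

def lines_addressed_to(lines, addressee):
    """Return every line whose metadata names the given addressee."""
    return [line for line in lines if line.addressee == addressee]

# Find the asides, i.e. the lines spoken to the audience.
asides = lines_addressed_to(play, "audience")
print([line.text for line in asides])
```

The text of the play never says which lines are asides; the answer comes entirely from the metadata.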

Wait, if there is so much information behind the scenes of electronic files, then how do I know what I am giving away? And who has access to this metadata again? Well, this touches on one of the biggest privacy issues of the 21st century. More info – Metadata.

Crowdsourcing

This is the miracle of the accessible Internet. People are on it all the time, they generate data and metadata of all sorts, and they can help computers to analyze it. The simplest example we looked at was a site where people had to sort images into categories in order to help a machine learn to differentiate them. This task is impossible for software to do properly, with the subjective judgement a human would bring, but it is not important enough to pay people to do it (in this case). So, just by asking the people of the Internet to assist, one can accomplish a lot of things for free. Users accumulate great amounts of knowledge about the world, and sharing it with people who can work with it is always valuable. Crowdsourcing doesn't have to be online only; it can easily happen in real life as well, as in the example of the archaeology class. However, the information gathered always ends up in a digital source in order to be easy to work with, so having users input it digitally to begin with is even easier.
In class we are working on a project about the food places around Abu Dhabi and, in particular, whether they serve NYUAD students well (prices, closeness to campus, delivery, etc.). In this project we are working as a team to input data, but we are also allowing other people to input information about their food adventures in the city as they go along, without any type of incentive: crowdsourcing. I find it amusing how well Google Forms works for such endeavours, as it is easy to work with, quick to set up and has enough options.
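Once the responses are collected, they still have to be combined. The following is a minimal sketch of that step, assuming the form responses have been exported to a .csv file; the column names (place, price_aed, delivers), the sample rows and the summarise helper are invented for this example and are not the fields of our actual form.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical export of form responses; column names and values are invented.
RESPONSES = """place,price_aed,delivers
Falafel Corner,18,yes
Falafel Corner,22,yes
Noodle House,35,no
"""

def summarise(csv_file):
    """Average the reported price per place and note any report of delivery."""
    prices = defaultdict(list)
    delivers = defaultdict(bool)
    for row in csv.DictReader(csv_file):
        place = row["place"].strip()
        prices[place].append(float(row["price_aed"]))
        # A place counts as delivering if at least one person said so.
        said_yes = row["delivers"].strip().lower() == "yes"
        delivers[place] = delivers[place] or said_yes
    return {
        place: {"avg_price": mean(values), "delivers": delivers[place]}
        for place, values in prices.items()
    }

summary = summarise(io.StringIO(RESPONSES))
for place, info in summary.items():
    print(place, info)
```

Averaging several independent reports is a simple way to smooth out one person's unusually cheap or expensive visit.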

Crowdsourcing

With the Internet available to a large part of the human population, and with its size bigger than ever before, crowdsourcing has become a great way of gathering data and information. There is a lot of work to be done given the vastness of the Internet, and for the newer social and psychological sciences in particular, gathering data directly from people without too much involvement is a great way to get things done. Wikipedia is the biggest example of such a project. Everyone uses it as a source of information when curious about something, and it is entirely the creation of users putting in information. Of course, a certain amount of editing must be done, but that is a lot easier once a page has been written out.

It works for many things: open source programs, digitisation of text, conducting studies, and informing people about different places and events all around the Earth. Crowdsourcing lets anyone become a part of something bigger than themselves and accomplish more as a part of humanity.

In the case of inputting text and working with digital texts as a whole, that is, "digital humanities", crowdsourcing is a very good way to enlarge the world's shared heritage. With so many texts written before it was possible to spread them at great speed to all corners of the Earth, there is a great need for help that doesn't require any special skills. Checking texts for mistakes made by software, scanning old books, etc. is easy for everyone, and there is a need for it.

Here in Abu Dhabi there is a great language barrier, as computers read Arabic poorly and make many mistakes. Also, in the context of NYUAD, we have a grand library collection of ancient Arabic texts that most of the world hasn't yet seen. Making a catalog of these texts would be one way to increase their availability, and it could be done by crowdsourcing within NYUAD. Moreover, since many students are fluent in Arabic, we could even transfer these texts into a digital format that is accurate and useful to scholars all around the world.

This work would be hard and long, and it would put great pressure solely on the people who know Arabic, so the best part of crowdsourcing, that it is free, would not be an option for this project. Some sort of incentive would have to be offered in order to find enough participants, but even then the numbers would be small and the chance of students carrying the whole workload would be low. This is one of the biggest problems faced by projects like this, which draw on a limited crowd. So instead we could devise a project that helps the wider Western world get to know the Arab world, rather than serving only those already interested in Arabic.

Another suggestion for a crowdsourcing project that could be carried out in the UAE, with the special help of NYUAD, is compiling a "yellow pages" for Abu Dhabi. A lot of tourist guides are posted on the Internet, but for the UAE they don't include sufficient information and not many people are writing them. One option is the expats living here, but they have to attend to their lives and work, so they don't have enough time to explore; another is journalists, who only come to visit; and the last and least likely is the locals who live here. This is a country with a very closed society, so having some way to understand more about it, presented in a pleasant, user-friendly manner, would be a great project.

NYUAD students, and other students who live here because they attend international universities, are the best option for crowdsourcing such a project: we have a lot of free time and travel around a lot, we try to find different forms of entertainment, and we also have different perspectives on what is "fun", "interesting" or "cheap", which will attract a larger audience. Sharing one's adventures is a good motivation to participate, because people love to talk about themselves. If there is a unified form for presenting information about an outing, rating it, etc., the information will also be easily categorised (in fact, as it is entered). Such a project may face editing problems, because young people can sometimes be too impulsive and write things that are not accurate. So, just like Wikipedia, it will need editors who not only check the entries but also, if possible, visit the places to give a more informed statement on a place, event or series of events.
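To make the idea of a unified form more concrete, here is a small sketch of an entry that is categorised as it is entered and then waits for an editor. Everything in it is an assumption for illustration: the Entry class, the list of categories, the 1 to 5 rating scale and the sample entries are invented.

```python
from dataclasses import dataclass

# Hypothetical categories that the form would offer.
CATEGORIES = {"food", "entertainment", "sightseeing"}

@dataclass
class Entry:
    place: str
    category: str
    rating: int             # 1 (poor) to 5 (excellent)
    comment: str
    verified: bool = False  # set by an editor after checking the entry

    def __post_init__(self):
        # Reject entries that do not fit the form, at the moment of input.
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")
        if not 1 <= self.rating <= 5:
            raise ValueError("rating must be between 1 and 5")

# Invented sample entries.
entries = [
    Entry("Example Beach", "sightseeing", 5, "Great at sunset."),
    Entry("Example Cafe", "food", 4, "Cheap and close to campus.",
          verified=True),
]

def review_queue(items):
    """Return the entries that an editor still has to check."""
    return [entry for entry in items if not entry.verified]

pending = review_queue(entries)
print([entry.place for entry in pending])
```

Validating at input keeps the categories consistent, while the verified flag leaves the judgement about accuracy to a human editor.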