We are on the move

With some (much, much less) of the trepidation that must have been in the minds of soldiers going over the top in World War I, we are about to move to our new site on April 15. You should find that our URL — http://www.measuringtheanzacs.org/ — will redirect to the new site. As with all the best laid plans, we don’t know quite how this will go …

We have rebuilt the workflow on the new Zooniverse software, which is more stable and continues to be supported. There are some significant changes, but we hope that you will quickly get used to the site.

At first we have just two tasks available: sorting pages, and transcribing History Sheets. These are performed separately, rather than in one integrated workflow. We think that as the database got very large, the integrated workflow was probably slowing down the site.

Unlike the previous site, we are unable to offer support for browsing entire files. This was a feature of our previous Scribe software, but it is not built into the Panoptes “software stack.”

Once we have seen that our History Sheet transcription workflows are working, we will release workflows for the other key pages: Attestations, Medical Attestations, Statements of Service, and Death Notifications.

The soldiers whose files are initially represented on the site are a special group. They are men who died during, or shortly after, the war and are remembered on war memorials in the Far North, including Kaitaia.

42631997081_f3af1129d6_o.jpg

We were honored last year to work with historian Kaye Dragicevich and contribute biographies of some of these soldiers to a book about them that will be published soon. As part of this collaboration, University of Minnesota Learning Abroad students working with Measuring the ANZACs researcher Evan Roberts visited Kaitaia and presented their biographies.

42631996401_83ef81111d_o.jpg

There are 2608 pages of files on 120 men represented on the site, and we hope that you will appreciate knowing the special history of this collaboration and what linked these men.

Our new site design will make it much easier for us to collaborate with you, and use Measuring the ANZACs as a platform for quickly transcribing the files of particular groups. If you can provide us with the list of men or women whose files you want transcribed, we can arrange for this data to be included in different batches of uploads. Please be in touch with us at measuringanzacs@umn.edu to discuss the possibilities of collaboration.

 

 

 

A new design and new software

If you’ve been doing any work on the Measuring the ANZACs site, you’ll know it’s been sluggish lately. We know too, and we’re sorry. So we’re re-designing on a new underlying software platform.

Despite the problems, we are going to leave the current site up. What we’ve seen is that if you use it by marking a page and then selecting “Transcribe this page”, there is some sluggishness, but it is not nearly as bad as when one goes straight to Transcribe. There is a class of students in Wellington who might be able to use it in this way later in June 2018. Leaving it up isn’t going to delay the new site, as the underlying software for the project is not being actively maintained.

When we launch the new site, the design will change. The experience of being in the archive flipping through a set of pages will be gone. The first task will be to sort pages. We’ve noticed that we’re not getting enough people hitting the medical attestations, which was really our motivation for the project! We live and learn. To increase the chances people hit the pages we want to see, we’ll be presenting pages in a more random order. This has its pros and cons, we recognize.

Once 3 people have voted that a page falls into a particular category (History, Statement of Services, Attestation, Medical Attestation) it will be passed along to a different workflow to transcribe that kind of page. We think we know all the varied questions on the Attestation now, and can hopefully make the experience there more similar to the other pages, where you select pre-determined questions rather than typing both question and answer.
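For the technically minded, the voting rule can be sketched in a few lines of Python. This is only an illustration of the idea: the function name, the threshold constant, and the way votes are stored are our own assumptions, not how the Zooniverse software implements it.

```python
from collections import Counter

# Assumed threshold, taken from the "3 people" rule described above.
VOTES_NEEDED = 3


def agreed_category(votes, votes_needed=VOTES_NEEDED):
    """Return the first category to reach the threshold, or None if none has yet.

    `votes` is a list of category names in the order they were cast
    (a hypothetical representation chosen for this sketch).
    """
    counts = Counter()
    for vote in votes:
        counts[vote] += 1
        if counts[vote] >= votes_needed:
            return vote
    return None


# A page with mixed votes is only passed along once one category has 3 votes.
print(agreed_category(["History", "Attestation", "History", "History"]))
print(agreed_category(["History", "Attestation"]))
```

The first call prints `History`; the second prints `None`, because no category has three votes yet.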

The workflows for transcribing each page will be somewhat separate, so at first we’ll launch transcription for the simplest pages. In other words, the Attestations will be identified but may not be immediately available to Transcribe for a period of time. They’ll come eventually.

Another change will be that transcription of each question will likely take place immediately after you’ve selected the box for it. We hope this will mean fewer pages marked up and not finished transcribing.

Designing a citizen science site for transcription of complex documents involves a lot of tradeoffs, and we say with full awareness of the irony that we’re fighting the last war. We’re trying to correct problems from the current site, but can’t fully anticipate the downsides of the new layout. We’ll be active on social media and here to get the word out that the site has re-launched. Thanks to our thousands of transcribers who have helped us gather a significant amount of data, and learn about the challenges of crowd-sourcing the transcription of complex social data.

An overdue Measuring the ANZACs update

It has been too long since we shared a progress update with our loyal transcribing forces. Thank you, as always, for the work that you are doing with us to build towards a complete transcription of New Zealand’s World War I soldiers.

Last year we reported that we had carried out initial assessments of transcription quality, and built our procedures for key fields to allow us to piece together a coherent record of one soldier’s life. The challenge, evident on reflection, is that even if all three transcriptions we have on a person are very similar (even identical) we have to develop a process for saying that. It’s the transcriptions that differ slightly, but are basically saying the same thing, that need work to reconcile.
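To give a flavor of what reconciling means, here is a minimal sketch in Python. It assumes a very simple rule (ignore case, punctuation, and extra spaces, then look for a majority); our real procedures are more involved, and the function names and example values below are invented for illustration.

```python
import re
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


def reconcile(transcriptions):
    """Return (value, agreed) for one field.

    `agreed` is True when more than half of the transcriptions match
    after normalizing; otherwise the field needs a closer look.
    """
    counts = Counter(normalize(t) for t in transcriptions)
    value, n = counts.most_common(1)[0]
    return value, n > len(transcriptions) / 2


# Hypothetical transcriptions of the same occupation field.
print(reconcile(["Farm Labourer", "farm labourer.", "Farm  labourer"]))
```

In this made-up example the three versions differ only in capitalization, punctuation, and spacing, so they reconcile to a single agreed value.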

With those processes built, we have begun research projects to scope what we can do with the data. As you know, our goal is to build a very large dataset of tens of thousands of soldiers to measure height and weight and health. At the moment we don’t have tens of thousands of complete records (so please keep transcribing, and tell your friends to join in!) so we are taking advantage of the depth of information on a smaller number of records to explore other aspects of soldiers’ lives.

One particularly interesting aspect that you, our citizen transcribers, have noticed is the misconduct citations. Working in a Sociology department with great criminologists, Evan Roberts has recruited students to analyze the misconduct citations. As our transcribers will know, what we ask with misconduct citations is that you identify whether or not there’s a citation. So we’ve been checking how accurate those Yes/No answers are. Mostly pretty good. We think a few people are led astray by our instructions that you answer yes if there’s “anything below” the heading Conduct Sheet. Literally speaking, the marriage and children information is below the heading, but we ask people to identify Marriage and Children information in a separate set of marking and transcription steps.

We’ve identified that offenses fall into about six key categories:

  • Absence without leave, or overstaying leave
  • Drunkenness
  • Insubordination or disrespect
  • Disobeying orders
  • Theft or damage of property
  • Other offenses

Punishments came in four general types:

  • Deprivation of pay
  • Deprivation of liberties (e.g. confined to barracks)
  • Reprimands from officers
  • Physical restraint (Field Punishments #1 and #2)

A fascinating aspect of the relationship between offenses and punishment is what happens if someone is drunk as well as committing other offenses. Does the drunkenness intensify the punishment, because it’s another thing done wrong, or does the drunkenness, in a sense, excuse and explain why someone damaged property or was out late? Our initial analyses suggest that the excuse and explanation perspective was more common. We can also see through this analysis, paired with the data you have transcribed, how misconduct fit into careers. Soldiers who committed misconduct while still in the lower ranks seemed to have a harder time getting promoted, as one would expect. However, soldiers already promoted to the lower officer ranks and then committing misconduct seemed to have inoculated themselves from getting demoted. The initial promotion showed they were worthy, and minor misconduct did not hold them back.

A new and re-designed site: We are excited to share with you plans for a new and re-designed site. We will be introducing what we hope is a streamlined process. One thing that will likely be lost in this new design is the ability to browse through all 30 or more pages of a long file. But by doing this, we hope that people will get all the way through shorter segments of files, of no more than 10 pages.

The re-design will start with a task sequence that just asks people to sort pages into our four types, plus “Other”. Once a page has been voted into a page type, it will be available for marking and transcription. We are going to try and take advantage of your work transcribing all the question text for the attestations by having pre-specified items on both pages of the attestation as well, now that we know all the questions. There will still be an option to transcribe a question and answer, if necessary.

We hope that the new site may be up and running by the end of summer, and we’ll let our loyal transcribers help us out with testing. The new site will allow us to collaborate better with other researchers who want to upload and transcribe a particular set of men’s files. Currently this is clunky at best. Stay tuned, tell your friends to join us, and please keep transcribing!

 

Rare but routine (or vice versa)

As you know we’re not transcribing all the documents in the Measuring the ANZACs files. We’re concentrating on the pages that give our research group, and other researchers including family historians, the best return on our collective time (thank you for transcribing, we really appreciate it).

The documents we’re transcribing are routine: they’re found in nearly every file. Excepting the South African files, which have a structure all of their own, the most common documents in the files are the History Sheets, which come in two forms.

This is the most common

Screen Shot 2017-04-04 at 11.25.40 AM.png

but you also see ones like this

Screen Shot 2017-04-04 at 11.29.49 AM.png

The numbering suggests that the top version is the later and more widely used one.

In the same way we have two versions of the Statement of Services, one like this (the later version)

Screen Shot 2017-04-04 at 11.29.39 AM.png

and another like this from earlier in the war

Screen Shot 2017-04-04 at 11.31.54 AM.png

Nearly all files have an attestation (and sometimes two copies), but they don’t tend to be found in officers’ and nurses’ files.

Active casualty forms are found in lots of files because so many of the New Zealand Expeditionary Force were killed or wounded (60% if you believe the contemporary calculations and reports).

Screen Shot 2017-04-04 at 11.33.31 AM.png

What’s interesting to us are the forms that seem routine, but don’t show up in many files.

One of our transcribing forces, 141Dial34, who’s become a community moderator, found this document recently. It’s a report on casualties issued in October 1916. Unremarkable in many ways, but look at the second row. It’s Casualty List No. 422. Lists are just ordinal, and so could start at any number. But typically people who are not mathematicians start lists at 1.

FL23249884.jpg

Why don’t we see these lists in more files? One reason would be that they’re duplicative when they appear in individual files. Our partners at Archives New Zealand appear to have a collection of more of these lists.

Screen Shot 2017-04-04 at 11.41.30 AM.png

Here’s another example just posted today. It’s an age declaration. We know that about 10% of the men who tried to enlist were “adjusting” their age, typically two years upwards. This looks like a mimeographed piece of text (the purple lettering is a giveaway), so this was probably done multiple times. Why don’t we see more of them in the files?

Screen Shot 2017-04-04 at 11.42.26 AM.png

If you’ve worked with archives you’ll know that not everything is saved. Some pieces of paper are thrown out because they’re routine, or they are seen to have little value. The survival of some of these documents in some files hints at the culling that’s occurred, and what might have been the full record.

If you find something different, use the Discuss this personnel record link to bring it to Talk. It builds our Measuring the ANZACs community, introduces you to others who are transcribing, and helps research. We’ve started looking at the POWs after some examples were brought to our attention on Talk. We also love to share your finds with a wider audience on social media. Your chance at fame.

As always, thanks for all your contributions to the Measuring the ANZACs community and effort.

Evan Roberts (eroberts [at] umn dot edu)

Who’s doing all the transcriptions?

We reported last week that we were making great(er) progress with nearly three quarters of a million fields transcribed after 16 months. Another thing that’s interesting is to look at who is doing the work.

An important thing to know is that if you’re not logged in we just record you as an anonymous user. Just over 100,000 of our transcriptions were like that. We suspect there are some repeat visitors in there, and we hope you’ll register!

Like a lot of citizen science projects a lot of the work is being done by a small number of people. We had 6,670 registered users on the site, and just 80 people did more than half the work; starting with the top transcriber with 60,000 fields transcribed. In case this sounds unreasonable it’s the equivalent of someone who joined us when we launched and has transcribed one soldier a day … It’s great but might be only taking them 15-30 minutes a day. We need dozens more like them!

On the other end of the scale there are more than 1000 registered users who did just one transcription. Most likely—since they’re registered—they’ve come over from other Zooniverse projects, and didn’t stick around.

You can visualize this in a couple of ways. Because the numbers are so skewed we take logarithms (remember your high school or college mathematics!) to make the graphs more legible.

This is not at all unique to Measuring the ANZACs. Citizen science participation follows a “power law” and our project is different from galaxies and penguins just in its content.

user_ranks

user_ranks_log_log
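For anyone who wants to try this with their own numbers, here is a minimal Python sketch of the two calculations behind the graphs above: ranking users and taking logarithms, and counting how many of the busiest users account for half the work. The function names and the example counts are made up for illustration; they are not our actual data or code.

```python
import math


def log_rank_points(fields_per_user):
    """Return (log10 rank, log10 count) pairs, busiest user first.

    Every count must be at least 1, since the logarithm of 0 is undefined.
    """
    ordered = sorted(fields_per_user, reverse=True)
    return [
        (math.log10(rank), math.log10(count))
        for rank, count in enumerate(ordered, start=1)
    ]


def users_for_half(fields_per_user):
    """Smallest number of top users whose combined work exceeds half the total."""
    ordered = sorted(fields_per_user, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for n, count in enumerate(ordered, start=1):
        running += count
        if running > half:
            return n
    return len(ordered)


# Hypothetical, heavily skewed counts of fields transcribed per user.
example = [1000, 100, 10, 1]
print(log_rank_points(example))
print(users_for_half(example))
```

On a log-log graph, a power law shows up as points falling roughly along a straight line.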

Evan Roberts (eroberts [at] umn dot edu)

 

We’re making even quicker progress!

In our first year we had 545,000 fields transcribed (a field is a single box that you enter text into). Just four months later we received another batch of data, and we’ve now got 761,165 fields transcribed.

Let’s put that in perspective: 40% more data has been transcribed in just over 4 months, so we’re doing even more than we were achieving in the last few months of 2016.
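If you want to check the arithmetic, it only takes a couple of lines of Python. The variable names are ours; the two totals are the ones quoted above.

```python
first_year = 545_000   # fields transcribed in the first year
latest = 761_165       # fields transcribed as of the latest batch

new_fields = latest - first_year
growth = new_fields / first_year

print(f"{new_fields:,} new fields, {growth:.1%} growth")
# prints "216,165 new fields, 39.7% growth"
```

So the “40%” is 39.7% rounded up.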

One of the best ways to see this is to look at the weekly rolling average of daily transcriptions. This helps smooth out the day-to-day variation but also lets us see short run changes and trends. These help us keep an eye on how we’re engaging the community. If you’re not coming back to help tell the stories of these soldiers, that’s a worry!
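A weekly rolling average is simple to compute. The sketch below is one plain-Python way to do it, assuming a list with one transcription count per day; the function name and the example numbers are hypothetical, and this is not necessarily how our own graph was produced.

```python
def rolling_average(daily_counts, window=7):
    """Trailing mean over `window` days.

    The first window - 1 days are skipped because they do not yet have
    a full week of data behind them.
    """
    averages = []
    running = 0
    for i, count in enumerate(daily_counts):
        running += count
        if i >= window:
            # Drop the day that has just fallen out of the window.
            running -= daily_counts[i - window]
        if i >= window - 1:
            averages.append(running / window)
    return averages


# Made-up daily counts for eight days.
print(rolling_average([1, 2, 3, 4, 5, 6, 7, 8]))
# prints [4.0, 5.0]
```

Each value is the average of the seven days ending on that day, which is why one unusually busy or quiet day moves the line only a little.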

rolling_average Feb 2016 to March 2017 _Tuesday, 14 March 2017.png

You can see that we kept up the steady pace of transcriptions we’d started achieving from August 2016, and what’s really encouraging is that people took a break for Christmas and then came right back to it during January. This is great, and gives us some quantitative evidence that we’re building a good community of people transcribing with us. We had a class working on the site in early February (and we’ve taken out the spike right before they had an assignment due!) which has helped us out.

On average we’re getting through 1600 fields transcribed a day. This is great, and is the equivalent of about 6 soldiers’ files being completed each day. One thing we’ve noticed is that the History Sheets, which appear first, are the most worked over. We really need people to work through a whole file if they can. Eventually History Sheets will be retired when everything has been marked once and transcribed three times. But with the incredible levels of accuracy you’re achieving we can make great use of the first version of a transcription. So the more our transcribing forces can spread their effort through a file the better!

Thanks for all that you do! Let’s keep going, and the more you can recruit others to join the Measuring the ANZACs forces the quicker we’ll complete our journey.

Evan Roberts

 

Service dates tell us a lot.

I’ve been using the Measuring the ANZACs platform in my Sociology of Health and Illness class this semester at the University of Minnesota. Teaching young adults in the Midwest why soldiers’ records from early twentieth century New Zealand are relevant to modern understanding of health and illness has helped me think more deeply about the material we’re working with.

For example, a big research question in the social scientific study of health is how do stress and trauma (different, but related things) affect our lives? How do our social connections help us survive bad things? In the long term as we assemble more and more individual records we’ll also have data on the units men were in, and can investigate the experience of what happened to men who served together.

For now, we can do a lot with simple information. Take service dates, which are summarized on the History Sheet (check out the Field Guide if you’re new to the Measuring the ANZACs forces). We can get an amazing amount of useful information about what affected men’s health and life from these four lines.

Screen Shot 2017-03-03 at 11.10.38 AM.png

The first line is very similar for a lot of men: a few weeks or months training in NZ. Later in the war we think we’re starting to see men who waited longer to get to war. They will be an interesting group to compare to men who were unlucky enough to get there earlier.

The second line, “Foreign”, is a measure of exposure to war, which presumably is worse than kicking around camp in New Zealand where the weather is mild and you’re not in a trench. Although we only get numbers and not details off this sheet, numbers are a great starting point. Was it bad just to step into the war zone, or did time matter? What was the difference between a week, a month, a year, or several years of foreign service? How did time matter?

Finally, the third and fourth lines identify another important health experience: when, if at all, men were exposed to the influenza epidemic. It isn’t listed as such, but we know when the flu was in particular places from other sources and we can map this information into the men’s files.

The key thing here is that service dates tell us a lot about a man’s war story, but they are also the starting point for some research questions that are still important today: how does stress accumulate over time to affect men’s (and women’s) lives? Help us answer these questions by getting the service dates in the correct format: one box for each line.

As always, if you have any questions stop by Talk to meet the researchers and other members of the Measuring the ANZACs forces.