41 Matching Annotations
  1. Apr 2019
  2. Dec 2018
    1. I sympathise with the frustration over getting things to work on different systems. It's annoying in class but even worse for workshops. Outside of academia, time is allotted for it, and it's considered part of doing business.

      But you do seem to have picked up the bug for employing "screwmeneutics", which will allow you to get down and see what you can do. And I think the first step is believing that you really can learn to do what you want by googling how to do it!

    1. I think this deserves a longer annotation. I suspect that the "let's get on with it" sentiment, as attractive as it is, may be unrealistic for a couple of reasons. First, the tools being leveraged are not uniquely employed by traditional humanities disciplines. When the topic modelling community consists of social scientists and search engine creators, there will inevitably be talk about how they can be applied within the context of literary criticism. Second, the lit crit field itself has passed through a period in which those who do not examine their methods critically are stamped as committers of the most heinous intellectual (and political) crimes. When digital humanists engage in such introspection, however, the terms in which they do so are so foreign that those who follow more traditional paradigms tend to have knee-jerk reactions. So the problem will probably never go away. But you are right that we can get on with our own research and satisfy ourselves that we have learnt something. Ultimately, achieving a critical mass of people who can use DH methods to do so will quiet down some of the angst.

      As an added note, much of the defensiveness and polemic dates back five years or so. It's worthwhile looking at Andrew Piper's Enumerations to see the difference in tone in 2018.

    1. Set

      What happened in March to cause a huge explosion of tweets? NEH grant announcements?

    2. UCLA has a handy page here which documents this fact in great detail for specific majors

      It would be useful to have a link to this page.

  3. Nov 2018
    1. list

      One possible approach is to maintain several stopword lists and compare the results of analysis using each list. You can then evaluate which results seem to be the most stable.
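
      As a rough illustration of what comparing stopword lists might look like, here is a minimal Python sketch. The toy text, the two stopword lists, and the function names `top_terms` and `stable_terms` are all assumptions made up for this example. "Stable" is taken here to mean "appears among the top terms under every list", which is only one of several ways you could define it.

```python
from collections import Counter

def top_terms(tokens, stopwords, n=5):
    """Return the n most frequent tokens that are not in the stopword list."""
    counts = Counter(t for t in tokens if t not in stopwords)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]

def stable_terms(tokens, stopword_lists, n=5):
    """Return the terms that appear in the top n under every stopword list."""
    results = [set(top_terms(tokens, sw, n)) for sw in stopword_lists.values()]
    return set.intersection(*results)

# Toy text and two hypothetical stopword lists (assumptions for illustration).
text = "the whale and the sea and the ship of the captain and the whale of the ship"
tokens = text.lower().split()
stopword_lists = {
    "minimal": {"the", "and"},
    "extended": {"the", "and", "of", "a", "in"},
}

stable = stable_terms(tokens, stopword_lists, n=3)
print(sorted(stable))
```

      With a real corpus you would probably compare the output of the full analysis (topics or clusters) and not just raw term counts, but the principle of intersecting results across lists is the same.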

    1. Environmental

      This section could use some reference to existing efforts to collect or analyse the data sources you are working with.

    1. study

      This might be a good place to put the time frame. Perhaps suggest the past year, which should be substantial enough for topic modelling, and say that we would be happy to receive a larger historical archive if it is available.

    2. Practically speaking this data would take the form of transcripts from your news broadcasts.

      I seem to recall your saying that you had come across information that transcripts might be available. It would be a good idea to mention this here so that they know what possible sources you are thinking about.

  4. Oct 2018
    1. we have no idea where the content landed

      This is a perennial problem. One way around it is to place milestone tags in your text and cut on the milestones. If you cut on the number of words, there are also ways to determine where segments begin and end. You can also download the cut segments and open them in an editor.
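
      To make the milestone idea concrete, here is a small Python sketch of cutting on a milestone tag and then printing the first and last few words of each segment so you can see where the content landed. This is not how Lexos itself does it. The tag `<ms/>`, the sample text, and the function names are assumptions chosen purely for illustration.

```python
def cut_on_milestone(text, milestone):
    """Split text at each occurrence of the milestone tag, dropping empty segments."""
    return [seg.strip() for seg in text.split(milestone) if seg.strip()]

def segment_edges(segment, n=3):
    """Return the first and last n words of a segment so you can see where it landed."""
    words = segment.split()
    return " ".join(words[:n]), " ".join(words[-n:])

MILESTONE = "<ms/>"  # hypothetical tag; any string that never occurs in the text works
text = (
    "Hwaet we Gardena in geardagum <ms/> "
    "Oft Scyld Scefing sceathena threatum <ms/> "
    "Tha waes on burgum"
)

segments = cut_on_milestone(text, MILESTONE)
for number, segment in enumerate(segments, start=1):
    start, end = segment_edges(segment)
    print(number, "|", start, "...", end)
```

      The same `segment_edges` check works on segments produced by cutting on a word count, which is one way to find out where each segment begins and ends.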

    1. Lexos project regarding Beowulf

      You would need to have a look at Beowulf Unlocked. Here's a review. Also this article as a starting point.

    1. willing to move to perhaps undesirable or perhaps less populous states

      This probably refers specifically to tenure-track academic jobs. Typically, a research university will have three medievalists and a comprehensive university or liberal arts college will have one. If the institutions in your region have these positions filled, there will be no jobs in your region, and you will have to look elsewhere.

    2. Outside of the field, there is a sense that at worst the humanities have become a propaganda tool for elitist politics or at best that they contribute nothing to the world.

      It may be striking that the sense that "grievance studies" are empowering for those with "grievances" is either not reaching or not resonating with the public. Or else it is with a portion of the public but driving them into niches.

    1. Schmidt himself bounces back and forth between a more capacious definition, one which includes ethnic and gender studies, and a more conservative definition (philosophy, history, languages, and English) which seems to serve as the focus of most of his analysis.

      One of the challenges is in identifying how closely ethnic and gender studies should be linked to the term "humanities" since the institutional units in which they are often situated are frequently interdisciplinary. Of course, you could say the same thing about Digital Humanities.

    1. again, as I'm reading this; I may be zeroing in on the wrong elements

      Keep in mind that the branches can be swung around, so Sense and Sensibility could appear to the right of Emma, and the short vertical branch between Sense and Sensibility and the Emma/Pride and Prejudice clade indicates a close relationship. By this reading, I think that what you have is a pre-20th century clade where the three novels by Brontë are closest together because of their common authorship. The Zelazny story shows some distant affinity to these works, and Lovecraft is even more stylistically distinct from the other texts than Austen is.

    1. a new layer to observation that is done outside of digital humanities

      This is potentially important, but I'm not sure what it means. Can you say more?

    1. the herculean task of project design

      An astute observation about DH, or at least screwmeneutic studies like this one. In the Humanities, we rarely go about things this way.

    2. that leabing in all those commonly used words would increase similarity

      One common observation about authorship attribution is that the authorial "fingerprints" are actually to be found in patterns of usage for function words (prepositions, pronouns, articles), which are typically treated as stop words.
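
      If you want to look at these "fingerprints" directly, a sketch along the following lines computes the relative frequency of a handful of function words in a text. The word list is a small hand-picked assumption, not a standard list, and the tokenisation (lowercasing and splitting on whitespace) is deliberately crude.

```python
from collections import Counter

# A small, hand-picked set of function words (an assumption for illustration).
FUNCTION_WORDS = ["the", "of", "and", "to", "in", "she", "he", "it"]

def function_word_profile(text, function_words=FUNCTION_WORDS):
    """Return each function word's share of all tokens in the text."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens)
    # Guard against empty texts so we never divide by zero.
    return {w: (counts[w] / total if total else 0.0) for w in function_words}

sample = "She went to the town and he stayed in the house"
profile = function_word_profile(sample)
for word, share in profile.items():
    print(f"{word:>4}  {share:.3f}")
```

      Comparing such profiles across texts by different authors, with and without a stop word list applied, should show how much of the signal the stop word list removes.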

    3. Sense and Northanger

      With your smaller set of stop words, the Sense/Northanger clade is flatter, suggesting that the differences aren't as great as when you did not apply a stop word list.

    1. the results would probably only reflect the subjects of the chapters.

      This is not necessarily bad, depending on what you are studying. However, a way to get around the problem is to cut the novels into arbitrary segments of a particular size. You may find, then, that the odd segment does not cluster with the rest of the novel (or chapter), but with text from elsewhere. When you encounter that, it's really interesting, and some explanation is required.
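
      Cutting into arbitrary segments of a fixed size can be sketched in a few lines of Python. The function name `cut_by_words` and the ten-word "novel" are assumptions for illustration. In practice the segment size would be in the hundreds or thousands of words.

```python
def cut_by_words(text, size):
    """Cut text into consecutive segments of `size` words; the last may be shorter."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

novel = "one two three four five six seven eight nine ten"
segments = cut_by_words(novel, 4)
for number, segment in enumerate(segments, start=1):
    print(number, "|", segment)
```

      Note that the final segment is usually shorter than the others. A very short final segment may behave oddly in a cluster analysis, so you may want to merge it with the previous one or leave it out.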

    1. python code

      See the section on Code, Code Blocks, and Syntax Highlighting here for instructions on how to represent code in Markdown.

    2. likely the seed started at a different token

      Yes, that is exactly what happened. You can set the random-seed to an arbitrary number so that the model can be duplicated. However, even when you get the same topic, it may not come out with the same topic number. I haven't tested that.
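
      The effect of fixing the seed can be seen in a toy Python sketch. This is not the topic modelling tool's own code: it only imitates the random starting assignment of tokens to topics that such a sampler begins from, and the function name `initial_assignments` and the sample sentence are assumptions for illustration.

```python
import random

def initial_assignments(tokens, num_topics, seed):
    """Give each token a random starting topic, as a topic model's sampler might."""
    # A private generator, so the seed alone determines the sequence of draws.
    rng = random.Random(seed)
    return [rng.randrange(num_topics) for _ in tokens]

tokens = "the quick brown fox jumps over the lazy dog".split()
run_a = initial_assignments(tokens, num_topics=3, seed=42)
run_b = initial_assignments(tokens, num_topics=3, seed=42)

# Two runs with the same seed start from exactly the same assignments.
print(run_a == run_b)
```

      A run with a different seed will almost certainly start from different assignments, which is why the topics you get can differ from one run to the next.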

  5. Sep 2018
    1. Sometimes you don't have a particular question you're asking, but when you have data sets to analyze, something will pop out that deserves further inquiry.

      I think you may be putting your finger on something important here: that DH works through a combination of screwmeneutics and serendipity. You might get some insight from a talk I gave in Lausanne a few years ago on Play as Process and Product.

    1. And yet the platypus, somewhat unhelpfully, continues to exist.

      You've perhaps inadvertently hit on an interesting convergence of literary text analysis and biology. Some of the algorithms we use are borrowed from bioinformatics and might, in that field, be seen to use probability as a way to overcome the shortcomings of Linnean (and similar) classifications. I would rephrase as "And yet the platypus, somewhat improbably, continues to exist".

    1. The authors seemed to prove something already apparent but necessary; to be shown from a new perspective.

      I think that a common criticism of DH is still that it demonstrates what is apparent or obvious with the implication (from its critics) that that is somehow unnecessary. That might be worth some class discussion.

    2. “Quiet Transformations” is difficult to follow because of how much you have to know about topic modeling before reading it.

      You are not the only one to observe this. I may have to change the order of readings in the future. The major reason I set the article for this week is to give you an idea of what it might mean to read at scale.

    1. Is there a taboo against touching these more "traditional" texts with these methods, in a way that there isn't for the Silmarillion?

      Amongst digital humanists, no. But we might think a little bit about whether quantitative methods are seen by others as somehow violating the sacred qualities of our canonical literature.

    2. "reasonably objective evidence"

      You're right to highlight this phrase. Others may have comments to make, or we can discuss it in class.

    3. To start with, I would like to clarify my title. I do not mean to imply that Lord of the Rings is not a quality text, or a "classic" in its own right.

      Is the distinction that you're trying to make that The Lord of the Rings is not "canonical"?

    1. I don't hold out hope of some Grand Unified Theory which will forever bind the various schools; but rather, I wonder if such consilience might rather take the form of several branching, occasionally intersecting, paths headed in roughly the same direction.

      I don't think you quite go in this direction, but it follows nicely from your point. Hyperspecialisation (and related factors) have prevented students from feeling comfortable with the territory of, say, computer science, making them afraid to learn computational skills that they might usefully apply in their work. But for students willing to learn coding or other skills typically considered outside the domain of English studies there are real opportunities for insight. For further information on this, try Googling "C.P. Snow Two Cultures", and you'll get a lot of material about how we came to this pass.

    1. The details may possibly end up becoming among the dozens or hundreds of ways to explain what he or she already teaches students.

      Historically, scholarship in literary studies was much more focused on details than it is today. Are DH methods doomed to be ignored for being detail-oriented by those who might be potential audiences for the results of those methods? Is not being detail oriented a good thing? It is a skill I try to teach my students, especially in their writing, but I admit that my successes have been few...

    1. remarkable scientific discoveries

      I think that there might be a fear that this is inappropriate for Humanities. Traditional Humanities scholarship privileges the completed, polished written argument. Lacking empirical notions of "truth", work in the Humanities tends to rely on persuasiveness, which is dependent on rhetoric — and the rise of poststructuralist theory has exacerbated (or exaggerated?) this phenomenon. Is Ramsay arguing for a more "scientific" way of doing literary scholarship (which is at the same time not science)?

    2. Who knows how long this will last, but it shows, if nothing else, a bit of the Hegelian dialectic, and we haven’t yet made it to the synthesis phase.

      I would add that the rise of these new technologies has coincided with a growing interest in book history in English departments.

    3. designers of algorithms often do place too much trust in them

      This is sort of an anecdote to reflect on. I recently met a woman who was involved in training many of the designers of algorithms for companies like Facebook. She pointed out that the designers are often fully aware of the shortcomings of their algorithms, but that their reservations don't get voiced properly because the business model requires releasing products on a deadline. The limitations are lost in a flurry of marketing cycles and hype.

    1. After all, few would argue that certain books have influenced literature more than others.

      Is this true? Are there ways in which we might measure "influence"?

    1. Does Google hinder the browsing Marche discussed?

      I'm not sure I would target Google specifically, but there has been a general observation that people no longer go to the library and "serendipitously" discover something interesting sitting on the shelf next to the book they were looking for. This was the inspiration behind Serendip-o-matic, which (full disclosure) I worked on. Take a look at that project to see how the issue was addressed.

    1. algorithm(s) do

      I think what you are saying here is that the algorithm is a black box. Marche's essay does not reveal its inner workings or point us to a code repository where we can explore it.

      However, you might also be referring to a kind of epistemological/hermeneutic question. Even if we understand that algorithmic process, what do we understand about the nature of the results of that process?

    2. a highly subjective process

      One of the issues that typically accompanies the "subjectivity" question is the "representativeness" question. How do you assemble a representative corpus, and what makes it representative?

    3. The Stephen Marche of 2012 who penned (keyed?) "Literature Is not Data: Against Digital Humanities" will no doubt be pleased to note that the project that Stephen Marche of 2017 participated in was heavy with humanistic interpretive acts

      This is a really important observation. DH is often accused by critics of not being interpretive, but it says something that one such critic dramatically demonstrates the role of human interpretation when he actually tries to implement a DH method himself.

    1. Here are some things to think about:

      • The Zine Scenes website is hosted on Tumblr. What are the consequences of choosing that technology? What are the advantages and disadvantages for the stated purposes of the project? What different choices could have been made?
      • What intellectual property issues arise in acquiring and distributing zine source materials?
      • How would you want to deploy computational resources to analyse, as opposed to disseminate, zine materials?
  6. Aug 2013