16 Matching Annotations
  1. Mar 2022
    1. Let’s just stop thinking data is perfect. It’s not. Data is primarily human-made. “Data-driven” doesn’t mean “unmistakably true,” and it never did.

      I found this quote interesting because it reminds us how all data has some sort of flaw to it. Especially when it comes to data on humans or society, there will always be some type of error, because the data is human-made.

    1. The data collectors put their faith in the algorithms, and the programmers put their faith in the data. At no point in this process is there any understanding, or wisdom. There’s not even domain knowledge. Data science is the universal answer, no matter the question.

      I found this quote very interesting because it shows the two sides of data science: the programmers who write the algorithms and the data collectors who review the data. It is peculiar to see how each side puts its faith in the other side's work.

    1. Of the 17 apps that The Times saw sending precise location data, just three on iOS and one on Android told users in a prompt during the permission process that the information could be used for advertising

      I found this interesting because Apple now has an option where an app has to ask you before it can track and save your data. However, I feel like there is definitely other metadata being tracked by the companies through the app.

    1. Building multilingual capabilities into Ocular, for example, allowed us to change the shape of early modern OCR by bringing multilingual documents to the center of automatic transcription rather than imposing an anachronistic, monolingual concept of textuality onto the early modern period. Manually altering our language data similarly allowed the system to be attuned to the unique orthographic qualities of the books in our corpus.

      I found this very interesting because it shows us how the different levels of OCR can affect multiple groups of researchers. As someone who has used OCR before, I have had to manually change some words based on the dirty OCR.

  2. Feb 2022
    1. Yet the Goodreads classics depart from these school-sanctioned lists in two particularly striking ways. First, the Goodreads classics are considerably less diverse in terms of the race and ethnicity of their authors. Race is extremely complex and difficult to reduce to data, especially because racial categories differ across different societies.

      This part stuck out to me because quantifying race is a very tricky subject, and trying to "tag" it into categories can be seen as very difficult and touchy.

    1. “This sort of scraping is a commonplace technique that supports research in the public interest, among other beneficial uses,” Crocker said. “As the court recognized, access to publicly available websites is not access ‘without authorization’ under the CFAA, nor does sending a cease and desist letter make such access unauthorized.”

      Something I found interesting was how scraping can be used for multiple reasons. Even though the data might be collected just for research, it can very easily be monetized and sold.

  3. knowyourdata-tfds.withgoogle.com
    1. Source features

      This website was interesting to me because the data was all pictures, and something as simple as a basic 3D shape has 480,000 items that we can analyze.

    1. But getting to know your data can reveal crucial gaps, bias, misinformation, or overlooked details in your story

      I agree with this because data may be deceiving at first glance, and until you analyze it thoroughly, you will not find out what is missing. It is important to be somewhat skeptical about data sets.

    1. We find ourselves needing models for gender that can accommodate much more nuance than our current standards

      This is an example of how humanists who struggle with data may not see eye to eye with analysts, because the two groups interpret data differently.

    1. The amount of dialogue, by age-range, is completely opposite for women versus men.

      This is very interesting to me because even though it is well known that the movie industry is dominated by men, I did not expect the data comparing men and women in movies to show such a drastic gap.

    1. That which we ignore reveals more than what we give our attention to. It’s in these things that we find cultural and colloquial hints of what is deemed important.

      This reminded me that when data is collected, not all of it is used or emphasized if it does not support an already existing argument or thesis. Data can be disregarded if it does not fit the narrative.

    1. It turned out that the young woman was indeed pregnant. Pole’s model informed Target before the teenager informed her family. By analyzing the purchase dates of approximately twenty-five common products, such as unscented lotion and large bags of cotton balls, the model found a set of purchase patterns that were highly correlated with pregnancy status and expected due date.

      This goes to show how personal user data is used and analyzed by companies, which are able to find trends and then use that information to increase their selective and direct marketing.

    1. The historical distance just isn’t that great. And the ghosts of slavery are still very much visible today.

      I think the article wraps up perfectly by stating how we can still see the lingering effects of slavery. This can be seen by comparing the southern states where the most slavery activity took place with the jail incarceration rates in those states.

    1. The ambitious aim of the site was to assemble and reference all surviving information about every trans-Atlantic slave voyage - whether that info came from a document or a secondary source. If anyone—within the academy or not—needs to know about any voyage in the 350 years of Atlantic slave trading, then our intention has always been that they would find it in our databases

      The objective of completing this project and filling in all the gaps through a combination of sources seems really cool to me, and I think it could be very useful for future research or studies.

    1. This database compiles information about more than 36,000 voyages that forcibly transported enslaved Africans across the Atlantic between 1514 and 1866

      I thought it was very interesting how the database was able to record and display such intricate and old data, based on earlier records.

    1. Data is the evidence of terror, and the idea of data as fundamental and objective information, as Fogel and Engerman found,47 obscures rather than reveals the scene of the crime. Black digital practice offers a corrective.

      After reading this far into the paper, I understand how Black digital practice serves as both a reminder and evidence of the horror that took place during slavery.