3,402 Matching Annotations
  1. Sep 2017
    1. Surman and Reilly (2003) focus on appropriation of networked technologies in a strategically, politically, and creatively innovative manner oriented toward social change. In this context of advocacy, effective technology appropriation includes strategic Internet use for collaboration, publishing, mobilization, and observation. Here, the delineation between use and appropriation occurs when technology is adapted to reflect goals and culture. Camacho (2001) describes appropriation by civil society organizations at the pinnacle of a technology use ladder. In the middle of the ladder, organizations focus on adoption of conventional technology. Toward the bottom, organizations and individuals with constrained access or slow adoption rates lag behind and seek access to technology. At the pinnacle, however, pioneers and activists appropriate technology to promote causes, for instance, creating flash mobs through mass text messaging to instantaneously organize large groups of people for social protest.

      From the beginning, the Data Week has been concerned with the perspective of social transformation in technological appropriation, since it is linked to grassroots capacity building, modification of the infrastructure, and the amplification of citizen voices vis-à-vis private or public initiatives.

    2. Appropriation considers both learning-by-using and learning-by-doing as central to the development of new processes (Rosenberg, 1982). Learning-by-doing in particular assumes that knowledge emerges through bricolage—tinkering with and recombining technology elements, thus enhancing one’s understanding (Lévi-Strauss, 1966). For Tuomi (2002), learning through appropriation is a user-centered process whereby users meld culture with material resources to innovate.

      Smalltalk's design principles make learning by doing one of the central elements of the aesthetics of encounter that this materiality favors.

      The continuum where there is no difference between the development environment and the user environment, or between user and maker, favors learning by doing. Exercises like those in the Manual de Periodismo de Datos are a more direct call to action and, therefore, to learning by doing. Here in particular we have seen, by example, how to change the tool to fit the needs that the handbook's own exercises have raised, but which can be abstracted to other problems.

    1. Over the course of many years, every school has refined and perfected the connections LMSs have into a wide variety of other campus systems including authentication systems, identity management systems, student information systems, assessment-related learning tools, library systems, digital textbook systems, and other content repositories. APIs and standards have decreased the complexity of supporting these connections, and over time it has become easier and more common to connect LMSs to – in some cases – several dozen or more other systems. This level of integration gives LMSs much more utility than they have out of the box – and also more “stickiness” that causes them to become harder to move away from. For LMS alternatives, achieving this same level of connectedness, particularly considering how brittle these connections can sometimes become over time, is a very difficult thing to achieve.
    1. Inconsistencies in ideological perspective are not uncommon in activities which attempt to balance individualism and communalism. These very frictions might belie possibilities for a greater imagination or shared experience. However, we argue that it is only in the disputes and frictions between pluralities of publics that democratization emerges. Dissensus across making and hacking communities allows people to experiment, eventually finding communities and processes in which they feel comfortable and can identify. These very migrations, connected by fluid narratives and practices, drive the capacity of communities to develop and innovate.

      We also experienced this idea of fluidity and confrontation at HackBo, with clashing views on the management of the space and a lack of collective support for certain collective initiatives, which made it possible to rethink dynamics and establish new groups.

    1. The open data definition that emerged focused on eight qualities of data: completeness, primacy, timeliness, ease of physical and electronic access, machine readability, non-discrimination, use of commonly owned standards, licensing, permanence, and usage costs.
    1. civic hackers often act as the “slow food movement” of digital political action, embracing local sourcing, ethical consumption, and pleasure of community work. Civic hackers are hardly responding to a new narrative of technological change.

      [...] Civic hackers thus speak against those who propose that the application of technology to politics produces a meta-category of activist.

      The comparison with slow food, and with what can be done locally, is interesting.

    2. Might “utopian realist” be applicable to the practices of civic hackers, intertwined with particular repertoires, technologies, and affective publics? McKenzie Wark (2014) suggests that the relationship between utopian and realist might be mutually constitutive rather than dialectical. He re-frames utopia as a realizable fragment or diagram that re-imagines relations. From this perspective, civic hacking gets traction not because it was ever intended to be the sole “solution” to a problem, but because it offers ways of acting and creating that are immediately apprehensible. Prototypes capture the imagination because they are shards of a possible future and can be created, modified, and argued about (Coleman, 2009).
    3. Civic hackers might be most appropriately described as utopian realists (Giddens, 1990: 154), a term Giddens employed to capture how assuaging negative consequences in a risk society required retaining Marx’s concern of connecting social change to institutional possibilities while leaving behind his formulation of history as determining and reliance on the proletariat as change agents. He positioned utopian realists as sensitive to social change, capable of creating positive models of society, and connecting with life politics.
    4. That “hackers” can model beneficial process disrupts the often presumed subversive nature of hacking as much as it does easy assumptions about a Foucaultian notion of governmentality. Prototypes act as working evidence to lobby for changing government process, particularly those that improve digital infrastructure or direct communication with citizens. The capability of code to act as a persuasive argument has long been noted, and modeling can produce charged debates about the very meaning of “civic.”

      [...] On a level of hackathons, prototypes can be speculative (Lodato and DiSalvo, in press) rather than an “outcome,” revealing conflicting notions of “civic tech” (Shaw, 2014).

      Our approach has centered more on modeling, which is required for visualization, but also on the idea of building capacity in the infrastructure and in the community, which goes beyond the volatile prototype that is abandoned afterwards.

    5. The most popular apps to date have been highly instrumental ways to request services to fix city infrastructure, such as SeeClickFix, a platform that lets residents take pictures of issues that need repair, that are delivered to the appropriate city department as an actionable item. We might think of this as a base-level civic act similar to picking litter off the ground or painting over graffiti. Other activities are thicker modes of participation by generating data or metadata. The primary effort of the 2014 CodeAcross effort was to map existing sources of open data. The leader of the event, D.W. Ferrell, described “our role as citizens is to complement” efforts by the government and organizations such as CfA. Contributing to data repositories served purposes for multiple stakeholders:

      This instrumental perspective (in the Latin sense, not the English one) shows up in the solutionism of creating an app to save the world. Our approach is more about creating critical competencies, mediated by programming and data, in order to converse with the government.

    6. disputes over community as a particular category threatens to distract from a general focus on solidarity by activating “social bases of discursive publics that engage people across lines of basic difference in collective identities” (p. 374). A mutable, popularized hacker identity may have this potential, capable of processing and interpreting abstract systems of regulation.

      processing and interpreting abstract systems of regulation.

      The idea of using digital technology to increase our capacity for agency within complex techno-political systems. This converges with Bret Victor's arguments, but takes a more political perspective.

      The idea here is that civic hackers can be bridges between different communities. In a way, this is happening with the Data Weeks and the way they articulate different publics.

    7. Civic hackers widely view machine-readable data as more useful because it drives a wider variety of potential uses, even as the shift from informational uses raises the bar to the literacies required to interpret it. In civic hackathons, knowledge of government operations was as useful as technical knowledge.

      In the Data Week we try to bring these critical literacies into dialogue and to build grassroots capacities. In the third edition, for example, besides working with code, we also modeled how the language of state contracting fit inside the computational environment.

    8. Data activism and advocacy ranges from civic engagement (Putnam, 2001) to more oppositional activism (Jordan, 2001). In this sense, it is a specific association of technologically mediated participation with particular political goals (Lievrouw, 2011) resulting in a wide range of tactics. Although open government data is still evolving and is constrained by predictions for economic growth and self-regulation, I argue it enables civic hackers to participate in civic data politics. This is particularly important because the data-driven environment is often distanced from providing individuals a sense of agency to change their conditions (Couldry and Powell, 2014). Data activism and advocacy can take place through organizing on related topics, online through mediated data repositories such as Github, and in-person events such as hackathons.

      [...] Contributing, modeling, and contesting stem from residents leveraging possibilities of open data and software production to attempt to alter processes of governance.

      Within this broad spectrum, it would be nice to see forms of governance and how they move into the public sphere and articulate with the commons and with the entities charged with preserving and extending them.

      This transition is still disarticulated and we have not seen it. HackBo's forms of governance are still distant from public, private, and third-sector (NGO) institutional forms; although pieces of this puzzle are being explored in parallel, their scaling up to the city level remains to be seen.

    9. Open data, like open information before it, promised fixes for bureaucratic problems and leveling power asymmetries (Fenster, 2012). Municipal governments strapped for funds and in dire need of more efficient frameworks have, of course, welcomed the message that open government data can alleviate time-consuming FOIA requests, make services easier for residents to use, and drive hackathons as a form of public outreach.

      Interesting to see how CfA has enabled the transition from the NGO sector to the public one (see the previous paragraph).

    10. The open data definition drafted at Sebastopol describes data’s completeness, primacy, timeliness, ease of physical and electronic access, machine readability, non-discrimination, use of commonly owned standards, licensing, permanence, and usage costs. This description made it clear what the properties of data were, even as outcomes, fitting with an open-source model, were more

      [...] ambitious

    11. His stance was not cyberlibertarian (Barbrook and Cameron, 1996). As his successive refutation of transparency in this shift toward open data indicates (Lessig, 2009), he was quite concerned about efforts with software becoming distanced from tangible outcomes. Lessig might be regarded as a hacker in the mold of Tim Jordan (2008), taking a progressive perspective on how we might regulate technologies—alongside laws, norms, and markets—that affect behavior.
    12. First, FOIA provided accessible tools to put abstract ideas into practice. Everyday citizens started to attach various political notions to these activities. Second, information flowed into a journalistic ecosystem that was prepared to process and interpret it for everyday citizens. Information obtained through FOIA was being interpreted in stories that changed public opinion (Leff et al., 1986). Third, the ability for individuals to request information led to alternate uses for activists, public interest groups, and non-profit organizations.

      Interesting to see how journalism and activism connect. The need for such a connection had already been established in the entry on the Panama Papers.

    13. Yet, the natural equating of “openness” or government transparency (Hood and Heald, 2006) with accountability increasingly became dubious (Tkacz, 2012). The move to “open data” was often an imperative that didn’t make clear where the levers were for social change that benefited citizens (Lessig, 2009). Still, I argue that civic hackers are often uniquely positioned to act on issues of public concern; they are in touch with local communities, with technical skills and, in many cases, institutional and legal literacies. I conclude by connecting the open data movement with a specific set of political tactics—requesting, digesting, contributing, modeling, and contesting data.

      Transparency and accountability are not the same thing, and there are no direct links between the one and the other. Government data offerings are about "entrepreneurship", not about accountability and traceability.

      However, the local knowledges that frame data as a form of citizen political action, including contestation, have been in evidence at HackBo with the Data Week and the Data Rodas.

    14. In each case data was framed as repressive of notions of civil society or enforcing an impoverished or constrictive notion of citizenship. The perspectives of Tufekci and Cheney-Lippold provide valuable insight into how algorithms and data are powerful shapers of modern life. Yet, they leave little room for a different form of algorithmic citizenship that might emerge where individuals desire to reform technology and data-driven processes. As Couldry and Powell (2014) note, models of algorithmic power (Beer, 2009; Lash, 2007) tend to downplay questions of individual agency. They suggest a need to “highlight not just the risks of creating and sharing data, but the opportunities as well” (p. 5). We should be attentive to moments where meaningful change can occur, even if those changes are fraught with forces of neoliberalism and tinged with technocracy.
    15. Civic hacking can broadly be described as a form of alternative/activist media that “employ or modify the communication artifacts, practices, and social arrangements of new information and communication technologies to challenge or alter dominant, expected, or accepted ways of doing society, culture, and politics” (Lievrouw, 2011: 19). Ample research has considered how changes in technology and access have created “an environment for politics that is increasingly information-rich and communication-intensive” (Bimber, 2001). Earl and Kimport (2011) argue that such digital activism draws attention to modes of protest—“digital repertoires of contention” (p. 180)—more than formalized political movements

      The idea of having "repertoires of contention" is similar to that of exaptation in design.

    16. Organizations such as Code for America (CfA) rallied support by positioning civic hacking as a mode of direct participation in improving structures of governance. However, critics objected to the involvement of corporations in civic hacking as well as their dubious political alignment and non-grassroots origins. Critical historian Evgeny Morozov (2013a) suggested that “civic hacker” is an apolitical category imposed by ideologies of “scientism” emanating from Silicon Valley. Tom Slee (2012) similarly described the open data movement as co-opted and neoliberalist.
    17. Successive waves of activists saw the Internet as a tool for transparency. The framing of openness shifted in meaning from information to data, weakening mechanisms for accountability even as it opened up new forms of political participation. Drawing on a year of interviews and participant observation, I suggest civic data hacking can be framed as a form of data activism and advocacy: requesting, digesting, contributing to, modeling, and contesting data.
    1. heart of sociological thought is the belief that we are all  a part of a vast tapestry of social connections.

      As I think about this, I am always a bit perplexed as to why SNA is not more foundational to Sociology. SNA reveals the very fabric of our society. Why is it not more utilized as a methodology? I suspect it has something to do with how hard it is to collect data.

    1. aspect of an average person’s life to very soundly prove their point.

      These outcomes have been linked to friends and social contacts. Research asks how many friends people have or how often they socialize. While this hints at the issue of networks, asking for lists or numbers does not produce network data. You have to find the links between people, and between the people those people know.

    1. encounter we make, relationship we build, or key-stroke on our technological device builds a network.

      This is what is called 'native data': data that is built through everyday behaviors.

    1. ask people to list those in their social circles who have intervened in abusive situations, people they have talked to about bystander intervention, or people whose opinion on intervening is important to them.

      What would be the links between these people? If you ask someone to list their friends, you will get lists that produce a star network. There needs to be a second round of questions involving friends of friends. Getting network data requires asking interrelated people.
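      A toy sketch of this point in Python (names and ties are hypothetical, invented for illustration): a flat friends list only yields ego-centered "star" edges, while the friend-of-friend round supplies the alter-to-alter ties that make it network data.

```python
def star_edges(ego, friends):
    # A flat friends list: every edge touches the ego, so the graph is a star.
    return {(ego, f) for f in friends}

def with_alter_ties(edges, alter_ties):
    # Second-round answers ("which of your friends know each other?")
    # add the friend-to-friend edges that a star network lacks.
    return edges | set(alter_ties)

ego_net = star_edges("ana", ["bo", "cal", "dee"])
full_net = with_alter_ties(ego_net, [("bo", "cal")])
```

      In `ego_net` every edge contains the ego; only `full_net` has a tie between two alters, which is the structure SNA actually needs.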

    1. Projects, as an entanglement, are “open, partial and indeterminate” (Hodder, 2012, p. 159). They might be shown off at the next open house, or entirely forgotten. Peter noted that his hard drive was filling up with “functionally dead” shelved projects that got boring or required expertise from outside the space. Yet, to Peter failure was an indication of project success. He believed, as scholars of innovation do, that embracing failure contributed to better ideas. Projects also carried an ethical charge. Michael, a quiet member and software professional, reflected at length on what he called the “philosophical” side of projects. He saw them as an inroad to “participation in the fabrication process” that was empowering. In his words, projects were “manufacturing liberation.”

      Despite HackBo's unfinished projects, mentioned earlier, Grafoscopio, the Data Week, and the Data Rodas are meant to enable leaps from the infrastructure and to give a notion of continuity (see Markus's diagram). They can be shown at the next event, but they are definitely not meant to be discarded. Manufacturing liberation is important for such projects, but by constraining openness (pre-establishing technologies and themes) they survive into future iterations.

    2. These more formalized gatherings were an attempt to get people working and collaborating in a space that had mainly turned into a spot for hanging out, drinking and foosball. The shift to the new space was seen as an opportunity to encourage members to use the space in a more productive way. The space needed members as much as members needed the tools. Members echoed a liberalist concern with increasing freedom of individuals to act, while retaining hobbyist cultures’ engagement with materialities. Often hackerspace members also described the need for a hackerspace as part of a shift in their city’s economy

      Similar discussions about shared projects took place at HackBo in the beginning, with ideas such as launching a balloon into the stratosphere, crowdfunding hardware, and others that involved "follow-up meetings". Some of them convened members for a short time and attracted new permanent members. However, the three most enduring projects are two companies/foundations and the Data Weeks and Data Rodas around Grafoscopio.

      City-scale changes have been discussed informally but have never crystallized and, apart from specific activist actions such as the Gobernatón, they do not achieve city-scale impact.

    3. Interactions through things, and perceptions about their potential, were ways to negotiate between the seemingly conflicting imperatives of individualism and communalism (A. L. Toombs, Bardzell, & Bardzell). Members would deliberately design activities that were incomplete to encourage a playful material improvisation. In these ways, the “material sensibilities” of members were particularly important. Similarly, reading a history of craft into software hacking, Lingel and Regan (2014) found that software hackers identified their work with craft as process, embodiment, and community. These sensitive readings of interactions with stuff seemed to capture the genre of hackerspaces more accurately than accounts of action guided by culture.

      The idea of incomplete activities and material playfulness is embedded in the Data Week and Grafoscopio, as is the identification of software with craft, which dialogues with Aaron and Software Craftsmanship.

    4. This work perceptively suggested that people often don’t arrive at hackerspaces with an identity fully formed. Tools and projects, as socio-material assemblages, shepherded new arrivals in and helped them understand themselves in relation to the group. “The process of becoming such an established maker seems to rely less on inherent abilities, skills, or intelligence per se, and more on adopting an outlook about one’s agency”

      This has happened with the Data Week and Grafoscopio, and it is linked to communities of practice and to questions of identity.

      The characterization of the hacker can start here!

    5. That hackers are created, not born, is hardly a new claim. In Coding Freedom Gabriella Coleman described how an open-source hacker’s identity emerged from a fervent brew of digital connectivity, technological concepts, and shared work (Coleman, 2012). Political awareness was connected to liberalism through open-source and code over time. Put simply, being a hacker is a trajectory with multiple points of origin and destinations. Nor does suggesting that hackers are ordinary mean discarding a concern with exceptional hackers. We should be concerned with the Chelsea Mannings and Edward Snowdens of the world, and the causes they have championed.

      Can data activism be a connection between the concerns of the ordinary and the extraordinary hacker? The Data Week experience seems to support this claim: it is a frequent activity in our shared hackerspace that invites a diverse group of people but puts activist concerns forward as an explicit topic, instead of the neutralized "hello world" introduction to technology.

    6. Members often tout that “anyone can be a hacker.” While this claim is dubious – participation is limited by technical inclinations, skills, and comfort hanging around rowdy spaces – hackerspaces certainly help produce an “ordinary hacker.” They are sites where we can observe hacking’s movement from subculture to mainstream, and from an edgy to a popular identity.

      Hackerspaces are the places where hackers create hackers, as a social "recursive good". It remains to be seen what that "ordinary hacker" is like, what those spaces of makeshift aesthetics are, which preferences the people affiliated with them have, and how this configures or restricts forms of participation.

      Is the Data Week creating another type of hacker, one who is not ordinary, through more diverse calls and populations?

    1. extremely cool, but...

      comparing with tahoe-lafs:

      clearly separates writecap from readcap, but... does it grok readcap as separate from idcap?

      client-side encryption?

      n-of-k erasure encoding?
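      For reference, a minimal sketch of what k-of-n erasure encoding means, as a toy 2-of-3 XOR parity scheme in Python (Tahoe-LAFS itself uses Reed-Solomon coding via zfec for arbitrary parameters; this toy only survives the loss of any single share):

```python
def encode(data: bytes):
    """Split data into two half shares plus one XOR parity share (2-of-3)."""
    half = (len(data) + 1) // 2
    a, b = data[:half], data[half:].ljust(half, b"\0")  # pad b to equal length
    parity = bytes(x ^ y for x, y in zip(a, b))
    return [a, b, parity]

def decode(shares, length):
    """Recover the original bytes from any 2 of the 3 shares (missing = None)."""
    a, b, parity = shares
    if a is None:
        a = bytes(x ^ y for x, y in zip(b, parity))
    elif b is None:
        b = bytes(x ^ y for x, y in zip(a, parity))
    return (a + b)[:length]  # trim the padding added during encode
```

      Any two of the three shares suffice to rebuild the original bytes, which is the property the erasure-coding question above is probing for.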

  2. Aug 2017
    1. Embracing a culture of sharing that breaks down silos while maintaining ethical and privacy standards will be paramount.

      This is gnarly stuff though and deserves its own deep dive/bullet point.

    1. This has much in common with a customer relationship management system and facilitates the workflow around interventions as well as various visualisations. It’s unclear how the at-risk metric is calculated, but a more sophisticated predictive analytics engine might help in this regard.

      Have yet to notice much discussion of the relationships between SIS (Student Information Systems), CRM (Customer Relationship Management), ERP (Enterprise Resource Planning), and LMS (Learning Management Systems).

    1. In fact, academics now regularly tap into the reservoir of digitized material that Google helped create, using it as a dataset they can query, even if they can’t consume full texts.

      It's good to understand that exploring a corpus for "brainstorming" or discovering heretofore unseen connections is different from a discovery query that is meant to give access to an entire text.

  3. Jul 2017
    1. BMI

      Test if this shows up in another list.

    2. Finally found its BMI distribution... it turns out to be in the demographic category. So most samples from this study have BMI > 24. Good for us.

    1. Partial loss-of-function alleles cause the preferential loss of ventral structures and the expansion of remaining lateral and dorsal structures (Figure 1c) (Anderson and Nüsslein-Volhard, 1988). These loss-of-function mutations in spz produce the same phenotypes as maternal effect mutations in the 10 other genes of the dorsal group.

      This paper has been curated by Flybase.

    1. Obesity rs8043757 intron FTO 16:53,779,538 5.000 × 10^-110 NHGRI 23563607

      The top matching SNP for the keywords obesity, T2D, and CVD is on gene FTO.

    1. Obesity was highly prevalent among the study sample; 64.6% of females and 41.2% of males were obese according to Polynesian cutoffs (BMI ≥ 32 kg/m2). Females were less likely than males to have hypertension (31.7% vs. 36.7%) but equally likely to have diabetes (17.8% vs. 16.4%).

      Those with obesity but not hypertension or diabetes can be our candidates.

      The data set can be found here: dbGaP Study Accession: phs000972.v2.p1 https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000972.v2.p1

    1. This T2D study measured BMI, DBP, SBP and cardiovascular disease medications as well. May have samples we need.

    1. Samoans have been studied for >40 years with a focus on the increase in, and levels of, BMI, obesity, and associated cardiometabolic conditions due to economic modernization

      This one may contain the samples we need. Need to check their publications.

    1. ((obesity[Disease]) NOT type 2 diabetes mellitus[Disease]) NOT cardiovascular diseases[Disease] AND 1[s_discriminator]

      NCBI can save this query for me... I can annotate this as well.
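      The boolean logic of that query, mirrored locally over hypothetical sample records (ids and disease sets invented for illustration, not real dbGaP data):

```python
# (obesity NOT type 2 diabetes mellitus NOT cardiovascular diseases)
samples = [
    {"id": "s1", "diseases": {"obesity"}},
    {"id": "s2", "diseases": {"obesity", "type 2 diabetes mellitus"}},
    {"id": "s3", "diseases": {"obesity", "cardiovascular diseases"}},
]

def matches(sample):
    # Keep obesity samples, excluding the two comorbidities from the query.
    d = sample["diseases"]
    return ("obesity" in d
            and "type 2 diabetes mellitus" not in d
            and "cardiovascular diseases" not in d)

candidates = [s["id"] for s in samples if matches(s)]  # ["s1"]
```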

    1. Most proposals (148) were for a new study and publication, with confirmation of original studies’ results (3) being quite uncommon.

      This is a very important statistic to quote, because people always assume the negative use case, i.e., "weaponizing data". I think that does happen, but the more we can gather the statistics, the better we are able to address it.

  4. Jun 2017
    1. It’s noteworthy that 174 of the 177 proposals submitted did not involve reanalysis of original results or disproving of the findings of the original analyses; the bulk of the proposals were for new research. Thus, sponsors’ fears regarding making these data available have not been realized.

      Good statistic to have

    1. Yet, for all the seeming convenience of Microsoft Excel (and its ilk), we pay a hefty price — our time and sanity. “Hyperbole!” I hear you shout. “Nonsense!” I hear you cry. And, when these initial protestations fade, we are left with the ever popular: “I have a system.”

      If I had a dollar

    1. literature became data

      Doesn't this obfuscate the process? Literature became digital. Digital enables a wide range of further activity to take place on top of literature, including, perhaps, its datafication.

  5. May 2017
    1. What is clear, is that data are increasingly conceptualized as inherently valuable products of scientific research, rather than as components of the research process

      Data is beginning to be seen as inherently valuable rather than as just a component of the research process.

    2. the vast majority of scientific data generated in the 20th century have only been accessed by small groups of experts; and few of those data, selected in relation to the inferences made, have been made publicly available in scientific journal

      The vast majority of data is accessed only by the investigators.

    3. The real prize for society is not simply producing open data but facilitating open innovation. Open data enables a situation where the collective genius of thousands of researchers produces insights and analyses, inventions and understanding beyond what isolated individuals with their silos of data could produce.

      Shadbolt on what open data means

    1. volume, velocity, and variety

      Volume: the actual size of the traffic.

      Velocity: how fast the traffic shows up.

      Variety: data that can be unstructured, semi-structured, or multi-structured.

    1. a tax plan

      Simplify the tax code.

      Evolve public accounting/finance into a more real-time, open, and interactive public service. Transaction-level financial data should be available internally and externally.

      Participatory budgeting and other forms of public input should be well-factored into the public-planning process. 21st century government participation can be simplified and enriched at the same time.

    1. loopholes proliferated, and the tax code grew more complex

      correlated? causative?

      Complexity in law leads to more logic to parse and process, and therefore to more potential ambiguity in human processing.

      Do software engineering practices around code complexity (or the lack thereof) have fruitful applications here?
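      One way to make that concrete: cyclomatic complexity counts a program's independent decision points, and the same count can be read over codified rules. A sketch with Python's `ast` module (the tax functions are invented examples, not real tax law):

```python
import ast

# Node types that open a new decision path in the parsed source.
BRANCHES = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp, ast.IfExp)

def cyclomatic_complexity(source: str) -> int:
    """McCabe-style count: 1 + number of branch points in the source."""
    tree = ast.parse(source)
    return 1 + sum(isinstance(n, BRANCHES) for n in ast.walk(tree))

simple_rule = "def tax(income): return income * 0.2"
loopholed_rule = """
def tax(income, married, dependents):
    if married and dependents > 2:
        return income * 0.1
    elif income > 100000:
        return income * 0.3
    return income * 0.2
"""
```

      Each added clause or exception raises the count, a rough proxy for the parsing burden the note above worries about.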

  6. Apr 2017
    1. The Echo Look suffers from two dovetailing issues: the overwhelming potential for invasive data collection, and Amazon’s lack of a clear policy on how it might prevent that.

      Important to remember. Amazon shares very little about what it collects and what it does with what it collects.

    1. By producing free (or very accessible), high-performing services with high added value for the data they produce, these companies capture a gigantic share of users' digital activities. They thereby become the main service providers that governments must deal with if they want to enforce the law, particularly in the context of population surveillance and security operations.

      Voilà pourquoi les GAFAM sont aussi puissants (voire plus) que des États.

  7. Mar 2017
    1. I have not seen the full text of Commissioner Hsu's original draft of the "Digital Economy Basic Act". The discussion raised several issues, for example the basic definition of the digital economy; and the draft does not address the data industry, so Mr. Chan's vision of national data preservation may underestimate the technology and administrative procedures involved, which is inappropriately reflected in the provisions on data preservation. I do not know whether this has been revised.

    1. Furthermore, the results could focus on drawing the user into the virtual app space (immersive) or could use the portable nature of tablet to extend the experience into the physical space inhabited by the user (something I have called ’emersive’). Generative (emersive) Books that project coloured ambient light and/or audio into a darkened space Generative (immersive) Books that display abstracted video/audio from cameras/microphone, collaged or augmented with pre-designed content Books that contain location specific content from the internet combined with pre-authored/designed content

      These lines and the following ones define an interesting set of possibilities for digital publications. How can we bootstrap them from what we already have (e.g., Grafoscopio and the Data Week)?

    2. Some key themes arise from the two NNG reports on iPad usability: App designers should ensure perceived affordances / discoverability There is a lack of consistency between apps, lots of ‘wacky’ interaction methods. Designers should draw upon existing conventions (either OS or web) or users won’t know what to do. These are practical interaction design observations, but from a particular perspective, that of perceptual psychology. These conclusions are arrived at through a linear, rather than lateral process. By giving weight to building upon existing convention, because they are familiar to the user, there is a danger that genuinely new ideas (and the kind of ambition called for by Victor Bret) within tablet design will be suppressed. Kay’s vision of the Dynabook came from lateral thinking, and thinking about how children learn. Shouldn’t the items that we design for this device be generated in the same way?

      The idea of lateral thinking here is the key one. Can informatics be designed by nurturing lateral thinking? That seems related to the Jonas flopology.

    1. A first list of projects is available here, but more can be found by interacting with mentors from the Pharo community. Join dedicated channels (#gsoc-students for general interactions with students) on the Pharo Slack; to get an invitation for pharoproject.slack.com, visit here. Discuss with mentors the complexity and skills required for the different projects. Please help fix bugs, open relevant issues, suggest changes and additional features, help build a roadmap, and interact with mentors on the mailing list and/or Slack to get a better insight into the projects. The better the contributions, the better the chances of selection. Before applying: knowledge of OOP; a basic idea of Pharo & Smalltalk syntax and ongoing projects; past experience with Pharo & Smalltalk; interaction with the organisation. You can start with the Pharo MOOC: http://files.pharo.org/mooc/
    1. Corporate thought leaders have now realized that it is a much greater challenge to actually apply that data. The big takeaways in this topic are that data has to be seen to be acknowledged, tangible to be appreciated, and relevantly presented to have an impact. Connecting data on the macro level across an organization and then bringing it down to the individual stakeholder on the micro level seems to be the key in getting past the fact that right now big data is one thing to have and quite another to unlock.

      Simply possessing pools of data is of limited utility. It's like having a space ship but your only access point to it is through a pin hole in the garage wall that lets you see one small, random glint of ship; you (think you) know there's something awesome inside but that sense is really all you've got. Margaret points out that it has to be seen (data visualization), it has to be tangible (relevant to audience) and connected at micro and macro levels (storytelling). For all of the machine learning and AI that helps us access the spaceship, these key points are (for now) human-driven.

    1. wanted there to be a continuum, a narrative, that tracks the history of people disseminating, collecting, sharing data.

      Back to the question: Can data help tell the story? or does it obscure the humanity of the narrative?

    2. ail system and the telegraph and made Council Bluffs an enduring anchor of the sharing of information.

      The history of data points is on the ground, first, and then in the air, and then in the wires, and now, in the wireless.

    1. The Justice Department has announced charges against four people, including two Russian security officials, over cybercrimes linked to a massive hack of millions of Yahoo user accounts. [500M accounts, in 2014]

      Two of the defendants — Dmitry Dokuchaev and his superior Igor Sushchin — are officers of the Russian Federal Security Service, or FSB. According to court documents, they "protected, directed, facilitated and paid" two criminal hackers, Alexsey Belan and Karim Baratov, to access information that has intelligence value. Belan also allegedly used the information obtained for his personal financial gain.

    1. Prophet: a forecasting tool for time-series data, open-sourced by Facebook and written in R and Python.

      An open-source statistics tool for Python that can also be used from R.

    1. Either we own political technologies, or they will own us. The great potential of big data, big analysis and online forums will be used by us or against us. We must move fast to beat the billionaires.
    1. You can delete the data. You can limit its collection. You can restrict who sees it. You can inform students. You can encourage students to resist. Students have always resisted school surveillance.

      The first three of these can be tough for the individual faculty member to accomplish, but informing students and raising awareness around these issues can be done and is essential.

    1. with the publication of the “New Oxford Shakespeare”, they have shaped the debate about authorship in Elizabethan England. 

      Interesting how the technology improves.

    1. In addition, Neylon suggested that some low-level TDM goes on below the radar. ‘Text and data miners at universities often have to hide their location to avoid auto cut-offs of traditional publishers. This makes them harder to track. It’s difficult to draw the line between what’s text mining and what’s for researchers’ own use, for example, putting large volumes of papers into Mendeley or Zotero,’ he explained.

      Without a clear understanding of what a reference manager can do and what text and data mining is, it seems that some publishers will block the download of full texts on their platforms.

  8. Feb 2017
    1. A company that sells internet-connected teddy bears that allow kids and their far-away parents to exchange heartfelt messages left more than 800,000 customer credentials, as well as two million message recordings, totally exposed online for anyone to see and listen.

    1. Between 2013 and 2015 we accepted fewer than 25% of the total number of applications we received

      I'd love to see some stats on what the most common reasons for rejection are. Show me the data!

    1. Not in the right major. Not in the right class. Not in the right school. Not in the right country.

      There's a bit of a slippery slope here, no? Maybe it's Audrey on that slope, maybe it's data-happy schools/companies. In either case, I wonder if it might be productive to lay claim to some space on that slope, short of the dangers below, aware of them, and working to responsibly leverage machine intelligence alongside human understanding.

    2. All along the way, or perhaps somewhere along the way, we have confused surveillance for care. And that’s my takeaway for folks here today: when you work for a company or an institution that collects or trades data, you’re making it easy to surveil people and the stakes are high. They’re always high for the most vulnerable. By collecting so much data, you’re making it easy to discipline people. You’re making it easy to control people. You’re putting people at risk. You’re putting students at risk.
    3. Ed-Tech in a Time of Trump
    1. in order to facilitate advisors holding more productive conversations about potential academic directions with their advisees.

      Conversations!

    2. Each morning, all alerts triggered over the previous day are automatically sent to the advisor assigned to the impacted students, with a goal of advisor outreach to the student within 24 hours.

      Key that there's still a human and human relationships in the equation here.

    3. A single screen for each student offers all of the information that advisors reported was most essential to their work,

      Did students have access to the same data?

    4. and Georgia State's IT and legal offices readily accepted the security protocols put in place by EAB to protect the student data.

      So it's not as if this was done willy-nilly.

    1. In his spare time, the documentary photographer had been scraping information on Airbnb listings across the city and displaying them in interactive maps on his website, InsideAirbnb.com.

      Quite an undertaking!

    1. After a brief training session, participants spent six hours archiving environmental data from government websites, including those of the National Oceanic and Atmospheric Administration and the Interior Department.

      A worthwhile effort.

    2. An anonymous donor has provided storage on Amazon servers, and the information can be searched from a website at the University of Pennsylvania called Data Refuge. Though the Federal Records Act theoretically protects government data from deletion, scientists who rely on it say would rather be safe than sorry.

      Data refuge.

    1. In the node-list format, the first node in each row is ego, and the remaining nodes in that row are the nodes to which ego is connected (alters).

      Please don't do this!
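      The objection makes sense: node-list rows have variable length and conflate ego with alters, which makes parsing brittle. A minimal sketch (function name and sample rows are hypothetical) of converting node-list data into the more conventional edge-list format:

```python
# Node-list format: the first entry in each row is ego, and the remaining
# entries are the alters ego is connected to. Most network tools instead
# expect an edge list of (ego, alter) pairs, which this converter produces.

def node_list_to_edges(rows):
    edges = []
    for row in rows:
        ego, *alters = row.split()
        edges.extend((ego, alter) for alter in alters)
    return edges

# Hypothetical sample rows in node-list format.
rows = [
    "A B C",  # A is connected to B and C
    "B A",    # B is connected to A
]
```

      The edge list keeps one relation per record with a fixed width, which is easier to validate, filter, and load into standard network-analysis tools.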

  9. Jan 2017
    1. prospective interviewee,

      Just a side spiel: In terms of an interviewee and data, everything really is data. I'll be interviewing freshmen next semester with other SLU students and some things I have already told the group to take note of in notebooks (ha ha) are the different responses the interviewee gives. In a way, the sad little freshmen turn into our experiment. Everyone in the group records a different response. These responses include the obvious oral responses, body language, and tone of voice.

    1. Thousands of poorly secured MongoDB databases have been deleted by attackers recently. The attackers offer to restore the data in exchange for a ransom -- but they may not actually have a copy.

  10. Dec 2016
    1. evidence about obtaining higher productivity by using Agile methods

      If higher productivity came from including stakeholders in the frequent development releases, running a complementary scrum team on UX analysis should lead to improvement in quality.

    1. ‘In the past, if you were an alcohol distiller, you could throw up your hands and say, look, I don’t know who’s an alcoholic,’ he said. ‘Today, Facebook knows how much you’re checking Facebook. Twitter knows how much you’re checking Twitter. Gaming companies know how much you’re using their free-to-play games. If these companies wanted to do something, they could.’
    2. sites such as Facebook and Twitter automatically and continuously refresh the page; it’s impossible to get to the bottom of the feed.

      Well, it is not. A web-scraping technique used for the Data Selfies project goes to the end of Twitter's scrolling page (after scrolling through almost 3k tweets), which is useful for certain legitimate uses of scraping (like keeping watch over political discourse on Twitter).

      So infinite scrolling can be useful, but it is not enabled by default on these social networks. Could we change the way information is visualized to get an overview of it, instead of being focused on small details all the time in an infinite-scrolling treadmill?

    1. Smalltalk doesn’t have to be pragmatic, because it’s better than its imitators and the things that make it different are also the things that give it an advantage.
    1. Preserving

      Really love the proposal overall and look forward to seeing what comes of the project(s).

      One slight thing I'd like to mention here, in the interest of furthering the critical diversity is that, in addition to our need to preserve data/archives, I'm increasingly being persuaded of the need to construct data prevention policies and techniques that would allow many people--protestors, youth, citizens, hospital patients, insurance beneficiaries, et al--much needed space to present clean-ish slates.

  11. Nov 2016
  12. Oct 2016
    1. A large database of blood donors' personal information from the AU Red Cross was posted on a web server with directory browsing enabled, and discovered by someone scanning randomly. It is unknown whether anyone else downloaded the file before it was removed.

    1. My hope is that the book I’ve written gives people the courage to realize that this isn’t really about math at all, it’s about power.
    1. because of the judiciary’s concern that such data could be used to single out judges, who were freed from restrictive sentencing guidelines in 2005

      So why is everyone talking about getting rid of mandatory minimums? This makes it sound like they have already been eliminated.

    1. Outside of the classroom, universities can use connected devices to monitor their students, staff, and resources and equipment at a reduced operating cost, which saves everyone money.
    2. Devices connected to the cloud allow professors to gather data on their students and then determine which ones need the most individual attention and care.
    1. (courses.csail.mit.edu/18.337/2015/docs/50YearsDataScience.pdf)

      Nice reference!

    1. For G Suite users in primary/secondary (K-12) schools, Google does not use any user personal information (or any information associated with a Google Account) to target ads.

      In other words, Google does use everyone’s information (Data as New Oil) and can use such things to target ads in Higher Education.

  13. Sep 2016
    1. But ultimately you have to stop being meta. As Jeff Kaufman — a developer in Cambridge who's famous among effective altruists for, along with his wife Julia Wise, donating half their household's income to effective charities — argued in a talk about why global poverty should be a major focus, if you take meta-charity too far, you get a movement that's really good at expanding itself but not necessarily good at actually helping people.

      "Stop being meta" could be applied in some sense to meta-systems like Smalltalk and Lisp, because of their tendency to develop meta-tools used mostly by developers, instead of tools used by mostly everyone else. Blurring the distinction between "everyone else" and developers in their ability to build/use meta-tools means delivering tools and practices that can be a bridge with meta-tools. This is something we're trying to do with Grafoscopio and the Data Week.


    1. (Crazy app uptake + riding data + math wizardry = many surprises in store.)

      Like Waze for public transit? Way to merge official Open Data from municipal authorities with the power of crowdsourcing mass transportation.

    1.  all  intellectual  property  rights,  shall  remain  the  exclusive  property  of  the  [School/District],

      This is definitely not the case. Even in private groups would it ever make sense to say this?

    2. Access

      This really just extends the issue of "transfer" mentioned in 9.

    3. Data  Transfer  or  Destruction

      This is the first line item I don't feel like we have a proper contingency for or understand exactly how we would handle it.

      It seems important to address not just due to FERPA but to contracts/collaborations like that we have with eLife:

      What if eLife decides to drop h? Would we, could we delete all data/content related to their work with h? Even outside of contract termination, would we, could we transfer all their data back to them?

      The problem for our current relationship with schools is that we don't have institutional accounts whereby we might at least technically be able to collect all related data.

      Students could be signing up for h with personal email addresses.

      They could be using their h account outside of school so that their data isn't fully in the purview of the school.

      Question: if AISD starts using h on a big scale, 1) would we delete all AISD related data if they asked--say everything related to a certain email domain? 2) would we share all that data with them if they asked?

    4. Data  cannot  be  shared  with  any  additional  parties  without  prior  written  consent  of  the  Userexcept  as  required  by  law.”

      Something like this should probably be added to our PP.

    5. Data  Collection

      I'm really pleased with how hypothes.is addresses the issues on this page in our Privacy Policy.

    6. There  is  nothing  wrong  with  a  provider  usingde-­‐identified  data  for  other  purposes;  privacy  statutes,  after  all,  govern  PII,  not  de-­‐identified  data.

      Key point.

    1. Application Modern higher education institutions have unprecedentedly large and detailed collections of data about their students, and are growing increasingly sophisticated in their ability to merge datasets from diverse sources. As a result, institutions have great opportunities to analyze and intervene on student performance and student learning. While there are many potential applications of student data analysis in the institutional context, we focus here on four approaches that cover a broad range of the most common activities: data-based enrollment management, admissions, and financial aid decisions; analytics to inform broad-based program or policy changes related to retention; early-alert systems focused on successful degree completion; and adaptive courseware.

      Perhaps even more than other sections, this one recalls the trope:

      The difference probably comes from the impact of (institutional) “application”.

    2. the risk of re-identification increases by virtue of having more data points on students from multiple contexts

      Very important to keep in mind. Not only do we realise that re-identification is a risk, but this risk is exacerbated by the increase in “triangulation”. Hence some discussions about Differential Privacy.

    3. the automatic collection of students’ data through interactions with educational technologies as a part of their established and expected learning experiences raises new questions about the timing and content of student consent that were not relevant when such data collection required special procedures that extended beyond students’ regular educational experiences of students

      Useful reminder. Sounds a bit like “now that we have easier access to data, we have to be particularly careful”. Probably not the first reflex of most researchers before they start sending forms to their IRBs. Important for this to be explicitly designated as a concern, in IRBs.

    4. Responsible Use

      Again, this is probably a more felicitous wording than “privacy protection”. Sure, it takes as a given that some use of data is desirable. And the preceding section makes it sound like Learning Analytics advocates mostly need ammun… arguments to push their agenda. Still, the notion that we want to advocate for responsible use is more likely to find common ground than this notion that there’s a “data faucet” that should be switched on or off depending on certain stakeholders’ needs. After all, there exists a set of data use practices which are either uncontroversial or, at least, accepted as “par for the course” (no pun intended). For instance, we probably all assume that a registrar should receive the grade data needed to grant degrees and we understand that such data would come from other sources (say, a learning management system or a student information system).

    5. Data sharing over open-source platforms can create ambiguous rules about data ownership and publication authorship, or raise concerns about data misuse by others, thus discouraging liberal sharing of data.

      Surprising mention of “open-source platforms”, here. Doesn’t sound like these issues are absent from proprietary platforms. Maybe they mean non-institutional platforms (say, social media), where these issues are really pressing. But the wording is quite strange if that is the case.

    6. captures values such as transparency and student autonomy

      Indeed. “Privacy” makes it sound like a single factor, hiding the complexity of the matter and the importance of learners’ agency.

    7. Activities such as time spent on task and discussion board interactions are at the forefront of research.

      Really? These aren’t uncontroversial, to say the least. For instance, discussion board interactions often call for careful, mixed-method work with an eye to preventing instructor effect and confirmation bias. “Time on task” is almost a codeword for distinctions between models of learning. Research in cognitive science gives very nuanced value to “time spent on task” while the Malcolm Gladwells of the world usurp some research results. A major insight behind Competency-Based Education is that it can allow for some variance in terms of “time on task”. So it’s kind of surprising that this summary puts those two things to the fore.

    8. Research: Student data are used to conduct empirical studies designed primarily to advance knowledge in the field, though with the potential to influence institutional practices and interventions. Application: Student data are used to inform changes in institutional practices, programs, or policies, in order to improve student learning and support. Representation: Student data are used to report on the educational experiences and achievements of students to internal and external audiences, in ways that are more extensive and nuanced than the traditional transcript.

      Ha! The Chronicle’s summary framed these categories somewhat differently. Interesting. To me, the “application” part is really about student retention. But maybe that’s a bit of a cynical reading, based on an over-emphasis in the Learning Analytics sphere towards teleological, linear, and insular models of learning. Then, the “representation” part sounds closer to UDL than to learner-driven microcredentials. Both approaches are really interesting and chances are that the report brings them together. Finally, the Chronicle made it sound as though the research implied here were less directed. The mention that it has “the potential to influence institutional practices and interventions” may be strategic, as applied research meant to influence “decision-makers” is more likely to sway them than the type of exploratory research we so badly need.

    1. often private companies whose technologies power the systems universities use for predictive analytics and adaptive courseware
    2. the use of data in scholarly research about student learning; the use of data in systems like the admissions process or predictive-analytics programs that colleges use to spot students who should be referred to an academic counselor; and the ways colleges should treat nontraditional transcript data, alternative credentials, and other forms of documentation about students’ activities, such as badges, that recognize them for nonacademic skills.

      Useful breakdown. Research, predictive models, and recognition are quite distinct from one another and the approaches to data that they imply are quite different. In a way, the “personalized learning” model at the core of the second topic is close to the Big Data attitude (collect all the things and sense will come through eventually) with corresponding ethical problems. Though projects vary greatly, research has a much more solid base in both ethics and epistemology than the kind of Big Data approach used by technocentric outlets. The part about recognition, though, opens the most interesting door. Microcredentials and badges are a part of a broader picture. The data shared in those cases need not be so comprehensive, and learners have a lot of agency in the matter. In fact, when then-Ashoka Charles Tsai interviewed Mozilla executive director Mark Surman about badges, the message was quite clear: badges are a way to rethink education as a learner-driven “create your own path” adventure. The contrast between the three models reveals a lot: from the abstract world of research, to the top-down models of Minority Report-style predictive educating, all the way to a form of heutagogy. Lots to chew on.

    1. “We need much more honesty, about what data is being collected and about the inferences that they’re going to make about people. We need to be able to ask the university ‘What do you think you know about me?’”
    1. The importance of models may need to be underscored in this age of “big data” and “data mining”. Data, no matter how big, can only tell you what happened in the past. Unless you’re a historian, you actually care about the future — what will happen, what could happen, what would happen if you did this or that. Exploring these questions will always require models. Let’s get over “big data” — it’s time for “big modeling”.
    2. Readers are thus encouraged to examine and critique the model. If they disagree, they can modify it into a competing model with their own preferred assumptions, and use it to argue for their position. Model-driven material can be used as grounds for an informed debate about assumptions and tradeoffs. Modeling leads naturally from the particular to the general. Instead of seeing an individual proposal as “right or wrong”, “bad or good”, people can see it as one point in a large space of possibilities. By exploring the model, they come to understand the landscape of that space, and are in a position to invent better ideas for all the proposals to come. Model-driven material can serve as a kind of enhanced imagination.

      This is a part where my previous comments on data activism and data journalism (see 1, 2 & 3) and on more plural computing environments for the engagement of concerned citizens in the important issues of our time could intersect with Victor's discourse.

    3. The Gamma: Programming tools for data journalism

      (b) languages for novices or end-users, [...] If we can provide our climate scientists and energy engineers with a civilized computing environment, I believe it will make a very significant difference.

      But data journalists, and in fact data activists, social scientists, and so on, could be a "different type of novice", one that is more critically and politically involved (in the broader sense of the word "political").

      The wider dialogue on important matters (such as climate change) that is mediated, backed up, and understood through data requires more voices than the ones involved today. Because those voices need to reason and argue using data, we need to go beyond climate scientists and energy engineers as the only ones who need a "civilized computing environment" to participate in the complex and urgent matters of today's world. Previously, these more critical voices (activists, journalists, scientists) have helped to make policy makers accountable and more sensible on other important and urgent issues.

      In that sense my work with reproducible research on the Panama Papers, as a prototype of a data continuum environment, or others like Gamma, could serve as an exploration, invitation, and early implementation of what is possible to enrich this data/computing-enhanced dialogue.

    4. I say this despite the fact that my own work has been in much the opposite direction as Julia. Julia inherits the textual interaction of classic Matlab, SciPy and other children of the teletype — source code and command lines.

      The idea of a tradition of technologies that are "children of the teletype" is related to the comparison we make in the Data Week workshop/hackathon. In our case we talk about "Unix fathers" versus "Dynabook children" and the bifurcation/recombination points of these technologies:

    5. If efficiency incentives and tools have been effective for utilities, manufacturers, and designers, what about for end users? One concern I’ve always had is that most people have no idea where their energy goes, so any attempt to conserve is like optimizing a program without a profiler.
    6. The catalyst for such a scale-up will necessarily be political. But even with political will, it can’t happen without technology that’s capable of scaling, and economically viable at scale. As technologists, that’s where we come in.

      Maybe we come in before, by enabling this conversation (as said previously). The political agenda is currently coopted by economic interests far removed from a sustainable planet or the common good. Feedback loops can be a place to insert counter-hegemonic discourse, enabling a more plural and rational dialogue between civil society and government, beyond the short-term economic interests of current incumbents.

    7. This is aimed at people in the tech industry, and is more about what you can do with your career than at a hackathon. I’m not going to discuss policy and regulation, although they’re no less important than technological innovation. A good way to think about it, via Saul Griffith, is that it’s the role of technologists to create options for policy-makers.

      Nice to see this conversation between technology and broader socio-political problems happening so explicitly in Bret's discourse.

      What we're doing, in fact, is enabling this conversation between technologists and policy makers first, and we're highlighting it via hackathons/workshops, but not reducing it only to what happens there (an interesting critique of techno-solutionist hackathons is here), using the feedback loops in social networks, but with the intention of mobilizing a setup that goes beyond them. One example is our Twitter data selfies (picture/link below). The necessity of addressing urgent problems that involve complex techno-socio-political entanglements is felt more strongly in the Global South.

      ^ Up | Twitter data selfies: a strategy to increase the dialog between technologists/hackers and policy makers (click here for details).

  14. Aug 2016
    1. DATA GOVERNANCE

      Data governance suggests a public administration (PA) that acts as a single thinking and decision-making organism. An easy concept to absorb, but one that often does not reflect the real architecture of large public administrations, such as the governments of provincial-capital cities.

      Data governance starts from a PA that has designed or implemented its IT platform for (1) managing internal workflows and (2) managing services delivered to the public, in such a way as to totally eliminate the use of paper and to allow only data entry, both internally by the offices and by the users who request public services from public bodies. Data governance can be adequately and effectively implemented only if the PA takes these elements into account. In this regard, I take the opportunity to cite the 7 ICT platforms that the 14 large Italian metropolitan cities must build in the context of PON METRO. This is an opportunity for the 14 large Italian cities to adopt the same data governance, given that the 7 ICT platforms must (as a requirement) be interoperable with one another. Data governance is created together with the design of the IT platforms that allow the PA to "function" in its territory. Data governance is inextricably linked to data entry. Data entry does not involve scanning paper or handling non-open working formats. Data governance, in its daily operating procedures, is the basis of a quality open-data policy. A PA's data governance in 2016-17-... can no longer be founded on a public employee manually building and publishing a CSV file. Data governance should ensure that dataset publication procedures are automatic, deriving from the functionality of the management applications (ICT platforms) in use in the PA, with no human intervention except in the phase of filtering/masking data that concern individuals' privacy.

    1. Page 122

      Borgman on terms used by the humanities and social sciences to describe data and other types of analysis

      Humanists and social scientists frequently distinguish between primary and secondary information based on the degree of analysis. Yet this ordering sometimes conflates data, sources, and resources, as exemplified by a report that distinguishes "primary resources, e.g., books" from "secondary resources, e.g., catalogs." Resources also categorized as primary were sensor data, numerical data, and field notebooks, all of which would be considered data in the sciences. Rarely would the books, conference proceedings, and theses that the report categorizes as primary resources be considered data, except when used for text- or data-mining purposes. Catalogs, subject indices, citation indexes, search engines, and web portals were classified as secondary resources. These are typically viewed as tertiary resources in the library community because they describe primary and secondary resources. The distinctions between data, sources, and resources vary by discipline and circumstance. For the purposes of this book, primary resources are data; secondary resources are reports of research, whether publications or interim forms; and tertiary resources are catalogs, indexes, and directories that provide access to primary and secondary resources. Sources are the origins of these resources.

    2. Page XVIII

      Borgman notes that no social framework exists for data that is comparable to the framework that exists for publishing. Cf. Kitchin 2014, who argues that pre-big data, we privileged analysis over data to the point that we threw away the data afterwards. This is what creates the holes in our archives.

      She notes that these capabilities [of data management] must be compared to the remarkably stable scholarly communication system in which they exist. The reward system continues to be based on publishing journal articles, books, and conference papers. Peer review legitimizes scholarly work. Competition and cooperation are carefully balanced. The means by which scholarly publishing occurs is in an unstable state, but the basic functions remain relatively unchanged. While capturing and managing the "data deluge" is a major driver of scholarly infrastructure developments, no social framework for data exists that is comparable to that for publishing.

  15. Jul 2016
    1. Page 220

      Humanistic research takes place in a rich milieu that incorporates the cultural context of artifacts. Electronic text and models change the nature of scholarship in subtle and important ways, which have been discussed at great length since the humanities first began to contemplate the scholarly application of computing.

    2. Page 217

      Methods for organizing information in the humanities follow from their research practices. Humanists do not rely on subject indexing to locate material to the extent that the social sciences or sciences do. They are more likely to be searching for new interpretations that are not easily described in advance; the journey through texts, libraries, and archives often is the research.

    3. Page 223

      Borgman is discussing here the difference between the way humanists handle data and the way that scientists and social scientists do:

      When generating their own data, such as interviews or observations, humanists' efforts to describe and represent data are comparable to those of scholars in other disciplines. Often humanists are working with materials already described by the originator or holder of the records, such as libraries, archives, government agencies, or other entities. Whether or not the desired content is already described as data, scholars need to explain its evidentiary value in their own words. That report often becomes part of the final product. While scholarly publications in all fields set data within a context, in the humanities the context and interpretation are the scholarship.

    4. Pages 220-221

      Digital Humanities projects result in two general types of products. Digital libraries arise from scholarly collaborations and the initiatives of cultural heritage institutions to digitize their sources. These collections are popular for research and education. … The other general category of digital humanities products consists of assemblages of digitized cultural objects with associated analyses and interpretations. These are the equivalent of digital books in that they present an integrated research story, but they are much more, as they often include interactive components and direct links to the original sources on which the scholarship is based. … Projects that integrate digital records for widely scattered objects are a mix of a digital library and an assemblage.

    5. Page 219

      In the humanities, it is difficult to separate artifacts from practices or publications from data.

    6. Page 219

      Humanities scholars integrate and aggregate data from many sources. They need tools and services to analyze digital data, as do others in the sciences and social sciences, but also tools that assist them in interpretation and contemplation.

    7. Page 215

      What seems a clear line between publications and data in the sciences and social sciences is a decidedly fuzzy one in the humanities. Publications and other documents are central sources of data to humanists. … Data sources for the humanities are innumerable. Almost any document, physical artifact, or record of human activity can be used to study culture. Humanities scholars value new approaches, and recognizing something as a source of data (e.g., high school yearbooks, cookbooks, or wear patterns in the floor of public places) can be an act of scholarship. Discovering heretofore unknown treasures buried in the world's archives is particularly newsworthy. … It is impossible to inventory, much less digitize, all the data that might be useful to scholarly communities. Also distinctive about humanities data is their dispersion and separation from context. Cultural artifacts are bought and sold, looted in wars, and relocated to museums and private collections. International agreements on the repatriation of cultural objects now prevent many items from being exported, but items that were exported decades or centuries ago are unlikely to return to their original site. … Digitizing cultural records and artifacts makes them more malleable and mutable, which creates interesting possibilities for analyzing, contextualizing, and recombining objects. Yet digitizing objects separates them from their origins, exacerbating humanists' problems in maintaining the context. Removing text from its physical embodiment in a fixed object may delete features that are important to researchers, such as line and page breaks, fonts, illustrations, choices of paper, bindings, and marginalia. Scholars frequently would like to compare such features in multiple editions or copies.

    8. Page 214

      Borgman on information artifacts and communities:

      Artifacts in the humanities differ from those of the sciences and social sciences in several respects. Humanists use the largest array of information sources, and as a consequence, the distinction between documents and data is the least clear. They also have a greater number of audiences for their data and the products of their research. Whereas scientific findings usually must be translated for a general audience, humanities findings often are directly accessible and of immediate interest to the general public.

    9. Page 204

      Borgman on the different types of data in the social sciences:

      Data in the social sciences fall into two general categories. The first is data collected by researchers through experiments, interviews, surveys, observations, or similar means, analogous to scientific methods. … The second category is data collected by other people or institutions, usually for purposes other than research.

    10. Page 202

      Borgman on information artifacts in the social sciences

      Like the sciences, the social sciences create and use numerical information. Yet they differ in the sources of the data. While almost all scientific data are created by scientists for scientific purposes, a significant portion of social scientific data consists of records created for other purposes, by other parties.

    11. Borgman, Christine L. 2007. Scholarship in the Digital Age: Information, Infrastructure, and the Internet. Cambridge, Mass: MIT Press.

      My notes

    12. Page 147

      Borgman on the challenges facing the humanities in the age of Big Data:

      Text and data mining offer similar grand challenges in the humanities and social sciences. Gregory Crane provides some answers to the question, what do you do with a million books? Two obvious answers include the extraction of information about people, places, and events, and machine translation between languages. As digital libraries of books grow through scanning efforts such as Google Print, the Open Content Alliance, the Million Books Project, and comparable projects in Europe and China, and as more books are published in digital form, technical advances in data description, analysis, and verification are essential. These large collections differ from earlier, smaller efforts on several dimensions. They are much larger in scale, the content is more heterogeneous in topic and language, the granularity increases when individual words can be tagged, they are noisier than their well-curated predecessors, and their audiences are more diverse, reaching the general public in addition to the scholarly community. Computer scientists are working jointly with humanities, language, and other domain specialists to parse texts, extract named entities and places, improve optical character recognition techniques, and advance the state of the art of information retrieval.
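      As an aside on what "extracting named entities" means at its simplest, here is a toy heuristic of my own, not Crane's or Borgman's method: treat capitalized tokens that do not begin a sentence as candidate names of people or places. Real systems use trained named-entity recognition models; this only shows the shape of the task.

```python
import re
from collections import Counter

def candidate_entities(text):
    """Naive heuristic: count capitalized, non-sentence-initial tokens."""
    entities = Counter()
    for sentence in re.split(r"[.!?]\s+", text):
        tokens = sentence.split()
        # Skip the first token of each sentence: it is capitalized anyway
        for token in tokens[1:]:
            word = token.strip(",;:()\"'")
            if word.istitle() and len(word) > 1:
                entities[word] += 1
    return entities

sample = ("The library scanned books in Boston. Researchers in Boston "
          "and Paris tagged each word. The project reached Paris quickly.")
print(candidate_entities(sample).most_common(2))
```

      Even this crude counter surfaces "Boston" and "Paris" from the sample; the hard problems Borgman describes (noisy OCR, heterogeneous languages, scale) are exactly what separates this toy from a real system.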

    13. Page 137

      Borgman discusses here the case of NASA, which lost the original video recordings of the first moon landing in 1969. Backups exist, apparently, but they are of lower quality than the originals.

    14. Page 122

      Here Borgman suggests that there is some confusion, or lack of overlap, between the words that humanists and social scientists use to distinguish types of information and the language used to describe data.

      Humanists and social scientists frequently distinguish between primary and secondary information based on the degree of analysis. Yet this ordering sometimes conflates data, sources, and resources, as exemplified by a report that distinguishes "primary resources, e.g., books" from "secondary resources, e.g., catalogs." Resources also categorized as primary were sensor data, numerical data, and field notebooks, all of which would be considered data in the sciences. But rarely would the books, conference proceedings, and theses that the report categorizes as primary resources be considered data, except when used for text- or data-mining purposes. Catalogs, subject indices, citation indexes, search engines, and web portals were classified as secondary resources.

    1. Page 14

      Rockwell and Sinclair note that corporations are mining text, including our email; as they say here:

      more and more of our private textual correspondence is available for large-scale analysis and interpretation. We need to learn more about these methods to be able to think through the ethical, social, and political consequences. The humanities have traditions of engaging with issues of literacy, and big data should not be an exception. How to analyze, interpret, and exploit big data are big problems for the humanities.

    2. Page 14

      Rockwell and Sinclair note that HTML and PDF documents account for 17.8% and 9.2% of (I think) all data on the web, while images and movies account for 23.2% and 4.3%.