3,441 Matching Annotations
  1. Mar 2020
    1. Data has become a “natural resource” for advertising technology. “And, just as with every other precious resource, we all bear responsibility for its consumption,”
    2. They can form a new company that handles all operations within the EU but nowhere else. This subsidiary company can license and segregate European data from the parent company,
    3. To join the Privacy Shield Framework, a U.S.-based organization is required to self-certify to the Department of Commerce and publicly commit to comply with the Framework’s requirements. While joining the Privacy Shield is voluntary, the GDPR goes far beyond it.
    1. This protects you from government overreach, and protects your privacy, and makes identity theft a lot harder.
    2. For example, personal information. Each governmental ministry might have any amount of information on a given voter. Finance knows about your tax, Health knows about your Medicare claims, Justice knows about your criminal record, and the RTA knows the status of your driver's licence. But they don’t know what the OTHER ministries know about you. The data is segregated. Just because one group of people knows about a given part of your life or existence, doesn’t mean that they can get access to everything else. You, by design, do not have one single file that has everything on you. The data is segregated.
    1. Users have the right to obtain (in a machine readable format) their personal data for the purpose of transferring it from one controller to another, without being prevented from doing so by the data processor.
    2. Users have the right to access their personal data and information about how their personal data is being processed. If the user requests it, data controllers must provide an overview of the categories of data being processed, a copy of the actual data and details about the processing. The details should include the purpose, how the data was acquired and with whom it was shared.
    3. The Right to access is closely linked to the Right to data portability, but these two rights are not the same.
    1. Data privacy is now a global movement. We’re pleased to say we’re not the only ones who share this philosophy: web browsing companies like Brave have made it possible for you to browse the internet more privately, and you can use a search engine like DuckDuckGo to search with the freedom you deserve.
    2. our values remain the same – advocating for 100% data ownership, respecting user-privacy, being reliable and encouraging people to stay secure. Complete analytics, that’s 100% yours.
    3. Due to that, you have 100% data ownership as Matomo is hosted on your own servers and we have absolutely no way of gaining access to your data.

      Technically impossible for them to get your data if the data doesn't pass through them at all.

    4. when you choose Matomo Cloud, we acknowledge in our Terms that you own all rights, titles, and interest to your users’ data. We obtain no rights from you to your users’ data. This means we can’t on-sell it to third parties, we can’t claim ownership of it, and you can export your data anytime

      Technically impossible for them to sell your data if the data doesn't pass through them at all.

    5. the privacy of your users is respected
    6. The relationship is between the website owner (you) and the visitor, with no external sources looking in
  2. www.graphitedocs.com
    1. Own Your Encryption Keys. You would never trust a company to keep a record of your password for use anytime they want. Why would you do that with your encryption keys? With Graphite, you don't have to. You own and manage your keys so only YOU can decrypt your content.
    1. it would appear impossible to require a publisher to provide information on and obtain consent for the installation of cookies on his own website also with regard to those installed by “third parties”
    2. Our solution goes a bit further than this by pointing to the browser options, third-party tools and by linking to the third party providers, who are ultimately responsible for managing the opt-out for their own tracking tools.
    3. You are also not required to manage consent for third-party cookies directly on your site/app as this responsibility falls to the individual third-parties. You are, however, required to at least facilitate the process by linking to the relevant policies of these third-parties.
    4. the publisher would be required to check, from time to time, that what is declared by the third parties corresponds to the purposes they are actually aiming at via their cookies. This is a daunting task because a publisher often has no direct contacts with all the third parties installing cookies via his website, nor does he/she know the logic underlying the respective processing.
    5. When you think about data law and privacy legislation, cookies easily come to mind as they’re directly related to both. This often leads to the common misconception that the Cookie Law (ePrivacy directive) has been repealed by the General Data Protection Regulation (GDPR), which, in fact, it has not. Instead, you can think of the ePrivacy Directive and GDPR as working together and complementing each other, where, in the case of cookies, the ePrivacy generally takes precedence.
    1. Decision point #2 – Do you send any data to third parties, directly or inadvertently? [Image: GDPR cookie consent flowchart] Remember, inadvertently transmitting data to third parties can occur through the plugins you use on your website. You don't necessarily have to be doing this proactively. If the answer is “Yes,” then to comply with GDPR, you should use a cookie consent popup.
    1. You must clearly identify each party that may collect, receive, or use end users’ personal data as a consequence of your use of a Google product. You must also provide end users with prominent and easily accessible information about that party’s use of end users’ personal data.
    1. GDPR introduces a list of data subjects’ rights that should be obeyed by both data processors and data collectors. The list includes: Right of access by the data subject (Section 2, Article 15). Right to rectification (Section 3, Art 16). Right to object to processing (Section 4, Art 21). Right to erasure, also known as ‘right to be forgotten’ (Section 3, Art 17). Right to restrict processing (Section 3, Art 18). Right to data portability (Section 3, Art 20).
    1. An example of reliance on legitimate interests includes a computer store, using only the contact information provided by a customer in the context of a sale, serving that customer with direct regular mail marketing of similar product offerings — accompanied by an easy-to-select choice of online opt-out.
    1. This is no different where legitimate interests applies – see the examples below from the DPN. It should also be made clear that individuals have the right to object to processing of personal data on these grounds.
    2. Individuals can object to data processing for legitimate interests (Article 21 of the GDPR) with the controller getting the opportunity to defend themselves, whereas where the controller uses consent, individuals have the right to withdraw that consent and the ‘right to erasure’. The DPN observes that this may be a factor in whether companies rely on legitimate interests.

    1. Legitimate Interest may be used for marketing purposes as long as it has a minimal impact on a data subject’s privacy and it is likely the data subject will not object to the processing or be surprised by it.
    1. 15. Data sharing: practices and needs of the transport research community. In view of the growing importance of data sharing in transport research and based on recommendations of a dedicated expert group on the Transport Research Cloud, a new study will establish: 1. What are the needs and objections of transport researchers in relation to data sharing; 2. What are the training requirements needed for the transport research community to facilitate data sharing; and 3. What would potential user communities expect from a Transport Research Cloud? Type of Action: Provision of technical/scientific services by the Joint Research Centre. Indicative timetable: Second quarter of 2020. Indicative budget: EUR 0.20 million from the 2020 budget.
    1. multiple scandals have highlighted some very shady practices being enabled by consent-less data-mining — making both the risks and the erosion of users’ rights clear
    2. Earlier this year it began asking Europeans for consent to processing their selfies for facial recognition purposes — a highly controversial technology that regulatory intervention in the region had previously blocked. Yet now, as a consequence of Facebook’s confidence in crafting manipulative consent flows, it’s essentially figured out a way to circumvent EU citizens’ fundamental rights — by socially engineering Europeans to override their own best interests.
    3. The deceitful obfuscation of commercial intention certainly runs all the way through the data brokering and ad tech industries that sit behind much of the ‘free’ consumer Internet. Here consumers have plainly been kept in the dark so they cannot see and object to how their personal information is being handed around, sliced and diced, and used to try to manipulate them.
    4. design choices are being selected to be intentionally deceptive. To nudge the user to give up more than they realize. Or to agree to things they probably wouldn’t if they genuinely understood the decisions they were being pushed to make.
    1. sleep Student's Sleep Data

      Data which show the effect of two soporific drugs (increase in hours of sleep compared to control) on 10 patients.

    2. Pulse Pulse Rates and Exercise

      Students in a Stat2 class recorded resting pulse rates (in class), did three "laps" walking up/down a nearby set of stairs, and then measured their pulse rate after the exercise. They provided additional information about height, weight, exercise, and smoking habits via a survey.

    3. HSAUR agefat Total Body Composition Data

      Dataset used for KSB
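
      Not from the page above, but as a quick sketch of how one of these datasets can be pulled into code: the first entry is the classic Student sleep data (extra sleep under two drugs for the same 10 patients), and statsmodels can fetch it from the online Rdatasets mirror (network access assumed). A minimal Python example, illustrative only:

      import statsmodels.api as sm
      from scipy import stats

      # Fetch R's built-in "sleep" dataset from the Rdatasets mirror.
      sleep = sm.datasets.get_rdataset("sleep").data

      # Columns: extra (increase in hours of sleep), group (drug 1 or 2), ID (patient).
      drug1 = sleep[sleep["group"].astype(str) == "1"].sort_values("ID")["extra"]
      drug2 = sleep[sleep["group"].astype(str) == "2"].sort_values("ID")["extra"]

      # Each patient tried both drugs, so a paired t-test is the natural comparison.
      t_stat, p_value = stats.ttest_rel(drug1, drug2)
      print(f"mean extra sleep: drug 1 = {drug1.mean():.2f} h, drug 2 = {drug2.mean():.2f} h")
      print(f"paired t = {t_stat:.2f}, p = {p_value:.4f}")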

    1. Consent is one of six lawful grounds for processing data. It may be arguable that anti-spam measures such as reCaptcha can fall under "legitimate interests" (ie you don't need to ask for consent)
    1. A 1% sample of AddThis Data (“Sample Dataset”) is retained for a maximum of 24 months for business continuity purposes.
    1. Not only are public transport datasets useful for benchmarking route planning systems, they are also highly useful for benchmarking geospatial [13, 14] and temporal [15, 16] RDF systems due to the intrinsic geospatial and temporal properties of public transport datasets. While synthetic dataset generators already exist in the geospatial and temporal domain [17, 18], no systems exist yet that focus on realism, and specifically look into the generation of public transport datasets. As such, the main topic that we address in this work, is solving the need for realistic public transport datasets with geospatial and temporal characteristics, so that they can be used to benchmark RDF data management and route planning systems. More specifically, we introduce a mimicking algorithm for generating realistic public transport data, which is the main contribution of this work.
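
      The mimicking algorithm itself is not reproduced here. Purely as a naive illustration of what "public transport data with geospatial and temporal properties" means, here is a toy Python sketch that fabricates stops along a line plus a fixed-headway timetable; all names and numbers are invented and this is not the generator described in the paper:

      import random
      from datetime import datetime, timedelta

      random.seed(42)

      # Toy geospatial part: ten stops along a roughly straight line with jitter.
      stops = [
          {"id": f"stop-{i}",
           "lat": 50.0 + i * 0.010 + random.uniform(-0.002, 0.002),
           "lon": 4.0 + i * 0.015 + random.uniform(-0.002, 0.002)}
          for i in range(10)
      ]

      # Toy temporal part: trips every 15 minutes, 3 minutes between stops.
      def timetable(first_departure, headway_min, trips):
          for trip in range(trips):
              t = first_departure + timedelta(minutes=trip * headway_min)
              for stop in stops:
                  yield trip, stop["id"], t.isoformat()
                  t += timedelta(minutes=3)

      for row in list(timetable(datetime(2020, 1, 6, 6, 0), 15, 3))[:5]:
          print(row)
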
    1. Right now, if you want to know what data Facebook has about you, you don’t have the right to ask them to give you all of the data they have on you, and the right to know what they’ve done with it. You should have that right. You should have the right to know and have access to your data.
    1. Good data privacy practices by companies are good for the world. We wake up every day excited to change the world and keep the internet that we all know and love safe and transparent.
    2. startup focused on creating transparency in data. All that stuff you keep reading about the shenanigans with companies mishandling people's data? That's what we are working on fixing.
  3. Feb 2020
    1. Research ethics concerns issues, such as privacy, anonymity, informed consent and the sensitivity of data. Given that social media is part of society’s tendency to liquefy and blur the boundaries between the private and the public, labour/leisure, production/consumption (Fuchs, 2015a: Chapter 8), research ethics in social media research is particularly complex.
    2. One important aspect of critical social media research is the study of not just ideologies of the Internet but also ideologies on the Internet. Critical discourse analysis and ideology critique as research method have only been applied in a limited manner to social media data. Majid KhosraviNik (2013) argues in this context that ‘critical discourse analysis appears to have shied away from new media research in the bulk of its research’ (p. 292). Critical social media discourse analysis is a critical digital method for the study of how ideologies are expressed on social media in light of society’s power structures and contradictions that form the texts’ contexts.
    3. It has, for example, been common to study contemporary revolutions and protests (such as the 2011 Arab Spring) by collecting large amounts of tweets and analysing them. Such analyses can, however, tell us nothing about the degree to which activists use social and other media in protest communication, what their motivations are to use or not use social media, what their experiences have been, what problems they encounter in such uses and so on. If we only analyse big data, then the one-sided conclusion that contemporary rebellions are Facebook and Twitter revolutions is often the logical consequence (see Aouragh, 2016; Gerbaudo, 2012). Digital methods do not outdate but require traditional methods in order to avoid the pitfall of digital positivism. Traditional sociological methods, such as semi-structured interviews, participant observation, surveys, content and critical discourse analysis, focus groups, experiments, creative methods, participatory action research, statistical analysis of secondary data and so on, have not lost importance. We do not just have to understand what people do on the Internet but also why they do it, what the broader implications are, and how power structures frame and shape online activities
    4. Challenging big data analytics as the mainstream of digital media studies requires us to think about theoretical (ontological), methodological (epistemological) and ethical dimensions of an alternative paradigm

      Making the case for the need for digitally native research methodologies.

    5. Who communicates what to whom on social media with what effects? It forgets users’ subjectivity, experiences, norms, values and interpretations, as well as the embeddedness of the media into society’s power structures and social struggles. We need a paradigm shift from administrative digital positivist big data analytics towards critical social media research. Critical social media research combines critical social media theory, critical digital methods and critical-realist social media research ethics.
    6. de-emphasis of philosophy, theory, critique and qualitative analysis advances what Paul Lazarsfeld (2004 [1941]) termed administrative research, research that is predominantly concerned with how to make technologies and administration more efficient and effective.
    7. Big data analytics’ trouble is that it often does not connect statistical and computational research results to a broader analysis of human meanings, interpretations, experiences, attitudes, moral values, ethical dilemmas, uses, contradictions and macro-sociological implications of social media.
    8. Such funding initiatives privilege quantitative, computational approaches over qualitative, interpretative ones.
    9. There is a tendency in Internet Studies to engage with theory only on the micro- and middle-range levels that theorize single online phenomena but neglect the larger picture of society as a totality (Rice and Fuller, 2013). Such theories tend to be atomized. They just focus on single phenomena and miss society's big picture
    1. Data extraction is a piece of a larger puzzle called data integration (getting the data you want to a single place, from different systems, the way you want it), which people have been working on since the early 1980s.

      definition

  4. Jan 2020
    1. Sociedad Quimica y Minera de Chile (NYSE:SQM): Known as SQM, this company also extracts potash from the Atacama Desert and is Chile’s largest fertilizer producer.

      Potassium mines in 3rd world countries.

    1. Annotation extends that power to a web made not only of linked resources, but also of linked segments within them. If the web is a loom on which applications are woven, then annotation increases the thread count of the fabric. Annotation-powered applications exploit the denser weave by defining segments and attaching data or behavior to them.

      I remember the first time I truly understood what Jon meant when he said this. One web page can have an unlimited number of specific addresses pointing into its parts--and through annotation these parts can be connected to an unlimited number of parts of other things. Jon called it: Exploding the web! How far we've come from Vannevar Bush's musings...

    1. The Web Annotation Data Model specification describes a structured model and format to enable annotations to be shared and reused across different hardware and software platforms.

      The publication of this web standard changed everything. I look forward to true testing of interoperable open annotation. The publication of the standard nearly three years ago was a game changer, but the game is still in progress. The future potential is unlimited!

    1. A final word: when we do not understand something, it does not look like there is anything to be understood at all - it just looks like random noise. Just because it looks like noise does not mean there is no hidden structure.

      Excellent statement! Could this be the guiding principle of the current big data boom in biology?

  5. Dec 2019
    1. You can create sub-projects (or sub-contexts) by adding a backslash
    2. Double-click on a project/context to select all its sub-projects/contexts and therefore show their tasks
    3. arborescence

      First sighting of word arborescence. I thought they were just doing that for fun, as a play on "tree", but I guess it's a real graph theory concept (https://en.wikipedia.org/wiki/Arborescence_(graph_theory)).

    1. Your task list is a plain text file, not some proprietary format owned by a company or locked to a specific application.
    2. A simple and timeless format Plain text is the simplest file format there is. It will always be accessible, by some kind of application, forever.
  6. plaintext-productivity.net
    1. Avoiding complicated outlining or mind-mapping software saves a bunch of mouse clicks or dreaming up complicated visualizations (it helps if you are a linear thinker).

      Hmm. I'm not sure I agree with this thought/sentiment (though it's hard to tell since it's an incomplete sentence). I think visualizations and mind-mapping software might be an even better way to go, in terms of efficiency of editing (since they are specialized for the task), enjoyment of use, etc.

      The main thing text files have going for them is flexibility, portability, client-neutrality, the ability to get started right now without researching and evaluating a zillion competing GUI app alternatives.

    2. Plaintext files are tiny, simple, quick to work with, editable by tons of great programs, searchable by all modern operating systems, easy to back up, perfect for versioning, trivial to sync between devices, and are amazingly flexible in their uses and formats.
    3. In this system, plaintext files are used for most of the backbone of your organizational system.
  7. burnsoftware.wordpress.com
    1. Made to work alongside the various plain-text, Dropbox-syncing mobile notes apps, such as Denote for Android and Jottings for iPhone, all from an app for the Ubuntu desktop. Plain text notes anywhere you want. Easily synced between your desktop and phone. Notes, plain and simple.
  8. burnsoftware.wordpress.com
    1. Future proofs your journal entries by saving them as plain text and organizing them as you go. This means you can read or create entries when you don’t have DayJournal.
    1. Plain text is software and operating system agnostic. It's searchable, portable, lightweight, and easily manipulated. It's unstructured. It works when someone else's web server is down or your Outlook .PST file is corrupt. There's no exporting and importing, no databases or tags or flags or stars or prioritizing or insert company name here-induced rules on what you can and can't do with it.
    1. Countless productivity apps and sites store your tasks in their own proprietary database and file format. But you can work with your todo.txt file in every text editor ever made, regardless of operating system or vendor.
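
      Because todo.txt is plain text with a few lightweight conventions (an optional (A) priority, +project and @context tokens, a leading "x " for completed tasks), any language can read it. A small, hypothetical Python sketch (the file contents below are invented):

      import re
      from pathlib import Path

      # Write a tiny example file so the sketch is self-contained.
      Path("todo.txt").write_text(
          "(A) Draft GDPR notes +privacy @laptop\n"
          "x 2019-12-01 Sync journal entries +plaintext @home\n"
      )

      line_re = re.compile(r"^(x )?(\(([A-Z])\) )?(.*)$")

      for line in Path("todo.txt").read_text().splitlines():
          done, _, priority, rest = line_re.match(line).groups()
          print({
              "done": bool(done),
              "priority": priority,
              "projects": re.findall(r"\+(\S+)", rest),
              "contexts": re.findall(r"@(\S+)", rest),
              "text": rest,
          })
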
    1. greater integration of data, data security, and data sharing through the establishment of a searchable database.

      Would be great to connect these efforts with others who work on this from the data end, e.g. RDA as mentioned above.

      Also, the presentation at http://www.gfbr.global/wp-content/uploads/2018/12/PG4-Alpha-Ahmadou-Diallo.pptx states

      This data will be made available to the public and to scientific and humanitarian health communities to disseminate knowledge about the disease, support the expansion of research in West Africa, and improve patient care and future response to an outbreak.

      but the notion of public access is not clearly articulated in the present article.

    2. platform

      Does it have a name and online presence? The details provided here go beyond what's given in reference 13, but some more detail would still be useful, e.g. to connect the initiative to efforts directed at data management and curation more generally, for instance in the framework of the Research Data Alliance, https://www.rd-alliance.org/ .

    1. Practical highlights in my opinion:

      • It's important to know about data padding in PG.
      • Be conscious when modelling data tables about column ordering, but don't be pure-school about it; do it on a best-effort basis.
      • Gains of up to 25% in wasted storage are impressive, but always keep in mind the scope of the system. For me, the gains are not worth it in the short term. Whenever a system grows, it is possible to migrate data to more storage-efficient tables, but mind the operational burden.

      Here follow my own commands from trying out the article's points. I added - pg_column_size(row()) to each projection to get clear absolute sizes.

      -- How does row function work?
      
      SELECT pg_column_size(row()) AS empty,
             pg_column_size(row(0::SMALLINT)) AS byte2,
             pg_column_size(row(0::BIGINT)) AS byte8,
             pg_column_size(row(0::SMALLINT, 0::BIGINT)) AS byte16,
             pg_column_size(row(''::TEXT)) AS text0,
             pg_column_size(row('hola'::TEXT)) AS text4,
             0 AS term
      ;
      
      -- My own take on that
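      -- Note: uuid_generate_v4() requires the uuid-ossp extension
      -- (CREATE EXTENSION IF NOT EXISTS "uuid-ossp";); on PostgreSQL 13+
      -- the built-in gen_random_uuid() is an alternative.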
      
      SELECT pg_column_size(row()) AS empty,
             pg_column_size(row(uuid_generate_v4())) AS uuid_type,
             pg_column_size(row('hola mundo'::TEXT)) AS text_type,
             pg_column_size(row(uuid_generate_v4(), 'hola mundo'::TEXT)) AS uuid_text_type,
             pg_column_size(row('hola mundo'::TEXT, uuid_generate_v4())) AS text_uuid_type,
             0 AS term
      ;
      
      CREATE TABLE user_order (
        is_shipped    BOOLEAN NOT NULL DEFAULT false,
        user_id       BIGINT NOT NULL,
        order_total   NUMERIC NOT NULL,
        order_dt      TIMESTAMPTZ NOT NULL,
        order_type    SMALLINT NOT NULL,
        ship_dt       TIMESTAMPTZ,
        item_ct       INT NOT NULL,
        ship_cost     NUMERIC,
        receive_dt    TIMESTAMPTZ,
        tracking_cd   TEXT,
        id            BIGSERIAL PRIMARY KEY NOT NULL
      );
      
      SELECT a.attname, t.typname, t.typalign, t.typlen
        FROM pg_class c
        JOIN pg_attribute a ON (a.attrelid = c.oid)
        JOIN pg_type t ON (t.oid = a.atttypid)
       WHERE c.relname = 'user_order'
         AND a.attnum >= 0
       ORDER BY a.attnum;
      
      -- What is it about pg_class, pg_attribute and pg_type tables? For future investigation.
      
      -- SELECT sum(t.typlen)
      -- SELECT t.typlen
      SELECT a.attname, t.typname, t.typalign, t.typlen
        FROM pg_class c
        JOIN pg_attribute a ON (a.attrelid = c.oid)
        JOIN pg_type t ON (t.oid = a.atttypid)
       WHERE c.relname = 'user_order'
         AND a.attnum >= 0
       ORDER BY a.attnum
      ;
      
      -- Whoa! I need to master mocking data directly into db.
      
      INSERT INTO user_order (
          is_shipped, user_id, order_total, order_dt, order_type,
          ship_dt, item_ct, ship_cost, receive_dt, tracking_cd
      )
      SELECT true, 1000, 500.00, now() - INTERVAL '7 days',
             3, now() - INTERVAL '5 days', 10, 4.99,
             now() - INTERVAL '3 days', 'X5901324123479RROIENSTBKCV4'
        FROM generate_series(1, 1000000);
      
      -- New item to learn, pg_relation_size. 
      
      SELECT pg_relation_size('user_order') AS size_bytes,
             pg_size_pretty(pg_relation_size('user_order')) AS size_pretty;
      
      SELECT * FROM user_order LIMIT 1;
      
      SELECT pg_column_size(row(0::NUMERIC)) - pg_column_size(row()) AS zero_num,
             pg_column_size(row(1::NUMERIC)) - pg_column_size(row()) AS one_num,
             pg_column_size(row(9.9::NUMERIC)) - pg_column_size(row()) AS nine_point_nine_num,
             pg_column_size(row(1::INT2)) - pg_column_size(row()) AS int2,
             pg_column_size(row(1::INT4)) - pg_column_size(row()) AS int4,
             pg_column_size(row(1::INT2, 1::NUMERIC)) - pg_column_size(row()) AS int2_one_num,
             pg_column_size(row(1::INT4, 1::NUMERIC)) - pg_column_size(row()) AS int4_one_num,
             pg_column_size(row(1::NUMERIC, 1::INT4)) - pg_column_size(row()) AS one_num_int4,
             0 AS term
      ;
      
      SELECT pg_column_size(row(''::TEXT)) - pg_column_size(row()) AS empty_text,
             pg_column_size(row('a'::TEXT)) - pg_column_size(row()) AS len1_text,
             pg_column_size(row('abcd'::TEXT)) - pg_column_size(row()) AS len4_text,
             pg_column_size(row('abcde'::TEXT)) - pg_column_size(row()) AS len5_text,
             pg_column_size(row('abcdefgh'::TEXT)) - pg_column_size(row()) AS len8_text,
             pg_column_size(row('abcdefghi'::TEXT)) - pg_column_size(row()) AS len9_text,
             0 AS term
      ;
      
      SELECT pg_column_size(row(''::TEXT, 1::INT4)) - pg_column_size(row()) AS empty_text_int4,
             pg_column_size(row('a'::TEXT, 1::INT4)) - pg_column_size(row()) AS len1_text_int4,
             pg_column_size(row('abcd'::TEXT, 1::INT4)) - pg_column_size(row()) AS len4_text_int4,
             pg_column_size(row('abcde'::TEXT, 1::INT4)) - pg_column_size(row()) AS len5_text_int4,
             pg_column_size(row('abcdefgh'::TEXT, 1::INT4)) - pg_column_size(row()) AS len8_text_int4,
             pg_column_size(row('abcdefghi'::TEXT, 1::INT4)) - pg_column_size(row()) AS len9_text_int4,
             0 AS term
      ;
      
      SELECT pg_column_size(row(1::INT4, ''::TEXT)) - pg_column_size(row()) AS int4_empty_text,
             pg_column_size(row(1::INT4, 'a'::TEXT)) - pg_column_size(row()) AS int4_len1_text,
             pg_column_size(row(1::INT4, 'abcd'::TEXT)) - pg_column_size(row()) AS int4_len4_text,
             pg_column_size(row(1::INT4, 'abcde'::TEXT)) - pg_column_size(row()) AS int4_len5_text,
             pg_column_size(row(1::INT4, 'abcdefgh'::TEXT)) - pg_column_size(row()) AS int4_len8_text,
             pg_column_size(row(1::INT4, 'abcdefghi'::TEXT)) - pg_column_size(row()) AS int4_len9_text,
             0 AS term
      ;
      
      SELECT pg_column_size(row()) - pg_column_size(row()) AS empty_row,
             pg_column_size(row(''::TEXT)) - pg_column_size(row()) AS no_text,
             pg_column_size(row('a'::TEXT)) - pg_column_size(row()) AS min_text,
             pg_column_size(row(1::INT4, 'a'::TEXT)) - pg_column_size(row()) AS two_col,
             pg_column_size(row('a'::TEXT, 1::INT4)) - pg_column_size(row()) AS round4;
      
      SELECT pg_column_size(row()) - pg_column_size(row()) AS empty_row,
             pg_column_size(row(1::SMALLINT)) - pg_column_size(row()) AS int2,
             pg_column_size(row(1::INT)) - pg_column_size(row()) AS int4,
             pg_column_size(row(1::BIGINT)) - pg_column_size(row()) AS int8,
             pg_column_size(row(1::SMALLINT, 1::BIGINT)) - pg_column_size(row()) AS padded,
             pg_column_size(row(1::INT, 1::INT, 1::BIGINT)) - pg_column_size(row()) AS not_padded;
      
      SELECT a.attname, t.typname, t.typalign, t.typlen
        FROM pg_class c
        JOIN pg_attribute a ON (a.attrelid = c.oid)
        JOIN pg_type t ON (t.oid = a.atttypid)
       WHERE c.relname = 'user_order'
         AND a.attnum >= 0
       ORDER BY t.typlen DESC;
      
      DROP TABLE user_order;
      
      CREATE TABLE user_order (
        id            BIGSERIAL PRIMARY KEY NOT NULL,
        user_id       BIGINT NOT NULL,
        order_dt      TIMESTAMPTZ NOT NULL,
        ship_dt       TIMESTAMPTZ,
        receive_dt    TIMESTAMPTZ,
        item_ct       INT NOT NULL,
        order_type    SMALLINT NOT NULL,
        is_shipped    BOOLEAN NOT NULL DEFAULT false,
        order_total   NUMERIC NOT NULL,
        ship_cost     NUMERIC,
        tracking_cd   TEXT
      );
      
      -- And, what about other varying size types as JSONB?
      
      SELECT pg_column_size(row('{}'::JSONB)) - pg_column_size(row()) AS empty_jsonb,
             pg_column_size(row('{}'::JSONB, 0::INT4)) - pg_column_size(row()) AS empty_jsonb_int4,
             pg_column_size(row(0::INT4, '{}'::JSONB)) - pg_column_size(row()) AS int4_empty_jsonb,
             pg_column_size(row('{"a": 1}'::JSONB)) - pg_column_size(row()) AS basic_jsonb,
             pg_column_size(row('{"a": 1}'::JSONB, 0::INT4)) - pg_column_size(row()) AS basic_jsonb_int4,
             pg_column_size(row(0::INT4, '{"a": 1}'::JSONB)) - pg_column_size(row()) AS int4_basic_jsonb,
             0 AS term;
      
    1. Remarkably, studies receiving mainly public funding can, a quarter of a century on, still survive without making their data available in a useful way. In the UK a series of studies—the Avon Longitudinal Study of Parents and Children (ALSPAC) (100), UK Biobank (101), and Born in Bradford (102), among others—have surely been exemplary in promoting data accessibility.

      Critical points!

    1. view-helpers form-helpers form-helper view-helper button buttons form forms

      Since I didn't know which variant was canonical, I tagged with both/all variants. Gross.

    1. I'll give a little bit of the history to provide context. My own involvement in this started around 2008 after we had shipped our key-value store. My next project was to try to get a working Hadoop setup going, and move some of our recommendation processes there. Having little experience in this area, we naturally budgeted a few weeks for getting data in and out, and the rest of our time for implementing fancy prediction algorithms. So began a long slog. We originally planned to just scrape the data out of our existing Oracle data warehouse. The first discovery was that getting data out of Oracle quickly is something of a dark art. Worse, the data warehouse processing was not appropriate for the production batch processing we planned for Hadoop—much of the processing was non-reversible and specific to the reporting being done. We ended up avoiding the data warehouse and going directly to source databases and log files. Finally, we implemented another pipeline to load data into our key-value store for serving results. This mundane data copying ended up being one of the dominant items for the original development. Worse, any time there was a problem in any of the pipelines, the Hadoop system was largely useless—running fancy algorithms on bad data just produces more bad data. Although we had built things in a fairly generic way, each new data source required custom configuration to set up. It also proved to be the source of a huge number of errors and failures. The site features we had implemented on Hadoop became popular and we found ourselves with a long list of interested engineers. Each user had a list of systems they wanted integration with and a long list of new data feeds they wanted. ETL in Ancient Greece. Not much has changed.

      A great anecdote / story on the (pains) of data integration

    2. Effective use of data follows a kind of Maslow's hierarchy of needs. The base of the pyramid involves capturing all the relevant data, being able to put it together in an applicable processing environment (be that a fancy real-time query system or just text files and python scripts). This data needs to be modeled in a uniform way to make it easy to read and process. Once these basic needs of capturing data in a uniform way are taken care of it is reasonable to work on infrastructure to process this data in various ways—MapReduce, real-time query systems, etc. It's worth noting the obvious: without a reliable and complete data flow, a Hadoop cluster is little more than a very expensive and difficult to assemble space heater. Once data and processing are available, one can move concern on to more refined problems of good data models and consistent well understood semantics. Finally, concentration can shift to more sophisticated processing—better visualization, reporting, and algorithmic processing and prediction. In my experience, most organizations have huge holes in the base of this pyramid—they lack reliable complete data flow—but want to jump directly to advanced data modeling techniques. This is completely backwards. So the question is, how can we build reliable data flow throughout all the data systems in an organization?
    3. Data integration is making all the data an organization has available in all its services and systems.
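
      Not from the post, but a minimal sketch of the base of that pyramid ("capturing all the relevant data ... in a uniform way"): heterogeneous sources writing uniformly shaped, timestamped records to one append-only file, a stand-in for a real shared log. All names and values are made up:

      import json
      import time
      from pathlib import Path

      LOG = Path("events.ndjson")  # stand-in for a durable, shared log
      LOG.write_text("")           # start fresh for the example

      def publish(source, event_type, payload):
          """Append one uniformly shaped record, whatever system it came from."""
          record = {"ts": time.time(), "source": source, "type": event_type,
                    "payload": payload}
          with LOG.open("a") as f:
              f.write(json.dumps(record) + "\n")

      # Different systems, one shape: downstream consumers (batch jobs, key-value
      # loads, reports) read a single format instead of N custom pipelines.
      publish("web", "page_view", {"url": "/pricing", "user": 42})
      publish("billing", "invoice_paid", {"invoice": "INV-7", "amount": 99.0})

      for line in LOG.read_text().splitlines():
          rec = json.loads(line)
          print(rec["source"], rec["type"])
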
  9. Nov 2019
    1. TrackMeNot is user-installed and user-managed, residing wholly on users' system and functions without the need for 3rd-party servers or services. Placing users in full control is an essential feature of TrackMeNot, whose purpose is to protect against the unilateral policies set by search companies in their handling of our personal information.
    1. Google has confirmed that it partnered with health heavyweight Ascension, a Catholic health care system based in St. Louis that operates across 21 states and the District of Columbia.

      What happened to 'thou shalt not steal'?

    1. Speaking with MIT Technology Review, Rohit Prasad, Alexa’s head scientist, has now revealed further details about where Alexa is headed next. The crux of the plan is for the voice assistant to move from passive to proactive interactions. Rather than wait for and respond to requests, Alexa will anticipate what the user might want. The idea is to turn Alexa into an omnipresent companion that actively shapes and orchestrates your life. This will require Alexa to get to know you better than ever before.

      This is some next-level onslaught.

  10. codeactsineducation.wordpress.com
    1. SEL measurement is being done in myriad ways, involving multiple different conceptualizations of SEL, different political positions, and different sectoral interests.

      Here I am reminded of the book Counting What Counts

    1. This booklet tells you how to use the R statistical software to carry out some simple analyses that are common in analysing time series data.

      what is time series?
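
      The booklet itself uses R; as a rough Python analogue, a time series is simply values indexed by time, and the "simple analyses" typically start with plotting and smoothing. A tiny sketch with invented numbers:

      import numpy as np
      import pandas as pd

      # 24 months of synthetic data: a slow upward trend plus noise.
      idx = pd.date_range("2019-01-01", periods=24, freq="MS")
      rng = np.random.default_rng(0)
      series = pd.Series(10 + 0.5 * np.arange(24) + rng.normal(0, 1, 24), index=idx)

      print(series.head())
      # A 3-month moving average, one of the simplest smoothing techniques.
      print(series.rolling(window=3).mean().tail())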

    1. "While we hope that Google will lift these unwarranted sanctions for AdNauseam, it highlights a much more serious problem for Chrome users," the AdNauseam team adds. "It is frightening to think that at any moment Google can quietly make your extensions and data disappear, without so much as a warning."
  11. Oct 2019
    1. "Element" SelectorsEach component has a data-reach-* attribute on the underlying DOM element that you can think of as the "element" for the component.
    1. I frequently talk with people who are not that concerned about surveillance, or who feel that the positives outweigh the risks. Here, I want to share some important truths about surveillance: Surveillance can facilitate human rights abuses and even genocide Data is often used for different purposes than why it was collected Data often contains errors Surveillance typically operates with no accountability Surveillance changes our behavior Surveillance disproportionately impacts the marginalized Data privacy is a public good We don’t have to accept invasive surveillance

    1. Finish the projects we started in 2019, with priority on Documentatón, since it is not a cover but our own book.

      For me, finishing it comes down to resources (time, money, etc.). We can keep advancing in pieces, one chapter at a time, from one gathering to the next, but that would set a very slow pace. Previous experience shows this is not sustainable and that if we want a finished book, more than a collective effort, a high individual effort will be required. As the Documentatón report chart shows, one person can do more than the sum of the rest (speaking of not fearing solitude):

      However, if that is what is happening, mirroring the metrics of many free software projects, we depend heavily on those individuals' availability. In my case, I cannot continue with the Documentatón until I resolve the matter of my graduation articles, which turned into a real soap opera (that deserves its own blog post), and I preferred things like the Data Haiku precisely because they are more focused activities than an entire book, and they can later become chapters of one (for example the Datactivismos chapter) while conveying that air of agility and of things finished, which is precisely what we would like to communicate.

      I think we have to recognize, within the community dynamics, what we can do in them and at what rhythms, and understand that the rest will require extra resources (financial, personal, etc.) that we will have to provide as natural or legal persons, with our own effort or that of our companies/foundations.

    2. Community selection of topics for Data Weeks or Data Rodas. Support for projects by community participants. Periodic community meetings, something like a Data Roda on the first Friday of each month, even if only to greet each other synchronously, see what we are each up to, and do a friendly wax-on, wax-off round.

      These three points could be merged with the idea that participants propose their own projects and take ownership of planning and running the upcoming Data Rodas or Data Weeks.

      I would only drop the periodic character, since I believe one of our community's strengths is responding flexibly to whatever comes up. For example, we are now in an electoral period in Colombia. That is where my interest in visualizing campaign financing came from, but last week's events led instead to blikis, with comment support. An agile reaction to contingency rather than the rigorous follow-through of something pre-planned (I would like to pick campaign financing back up, but that will be later).

      Again, the suggestion, as I said in my post replying to this one and on other occasions, is to replace planning with coordination. My coordination proposal is the following:

      • Members who want to see other topics propose them in the community channels and take charge of preparing and running them.
      • The rest of us respond to those autonomous initiatives, in solidarity, joining those sessions and contributing to them.
      • At the end of each event, we look at where we can take the next ones.
    3. If I can contribute one more tool to Grafoscopio, I want it to be querendura.

      Querendura strikes me as a very powerful methodology. For now, however, it feels like a broad list of loose ideas in the link you share with us, and I would like to explore the concrete practices that make it possible.

  12. Sep 2019
    1. if n is very small (for example n = 3), rather than showing error bars and statistics, it is better to simply plot the individual data points.
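
      A minimal matplotlib sketch of that advice, with made-up numbers: for n = 3 per group, show the raw points instead of a bar with error bars.

      import matplotlib.pyplot as plt

      # Three invented replicate measurements per condition.
      groups = {"control": [1.1, 1.3, 0.9], "treated": [1.8, 2.1, 1.6]}

      fig, ax = plt.subplots()
      for x, (name, values) in enumerate(groups.items()):
          ax.scatter([x] * len(values), values, color="black", zorder=3)
      ax.set_xticks(range(len(groups)))
      ax.set_xticklabels(groups.keys())
      ax.set_ylabel("measurement")
      ax.set_title("n = 3: plot the individual data points")
      plt.show()
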
    1. Keep the ergonomics of stable reference and directly mutable objects. In other words: be able to have a variable pointing to an object, and make subsequent reads or writes to it, without needing to fear that you’re working with old data, while, in the background, state is stored in an immutable, structurally shared tree.
    1. With MobX you don't need to normalize your data.

      flip side: https://codeburst.io/the-curious-case-of-mobx-state-tree-7b4e22d461f:

      MobX cannot guarantee your data is JSON serializable,

    1. Estimated economic benefit of data linkage

      the potential value from linking Census data to administrative data sets is only beginning to be realised and holds immense potential.(In other work for the Population Health Research Network, Lateral Economics concluded that data linkage generated over $16 for every dollar invested).

    2. Cost reduction suggestion

      there may be ways to reduce costs associated with the development of Census-equivalent statistics, including relying less on the general public to answer questions every five years

    1. “But then again,” a person who used information in this way might say, “it’s not like I would be deliberately discriminating against anyone. It’s just an unfortunate proxy variable for lack of privilege and proximity to state violence.

      In the current universe, Twitter also makes a number of predictions about users that could be used as proxy variables for economic and cultural characteristics. It can display things like your audience's net worth as well as indicators commonly linked to political orientation. Triangulating some of this data could allow for other forms of intended or unintended discrimination.

      I've already been able to view a wide range of (possibly spurious) information about my own reading audience through these analytics. On September 9th, 2019, I started a Twitter account for my 19th Century Open Pedagogy project and began serializing installments of a critical edition, The Woman in White: Grangerized. The @OPP19c Twitter account has 62 followers as of September 17th.

      Having followers means I have access to an audience analytics toolbar. Some of the account's followers are nineteenth-century studies or pedagogy organizations rather than individuals. Twitter tracks each account as an individual, however, and I was surprised to see some of the demographics Twitter broke them down into. (If you're one of these followers: thank you and sorry. I find this data a bit uncomfortable.)

      Within this dashboard, I have a "Consumer Buying Styles" display that identifies categories such as "quick and easy" "ethnic explorers" "value conscious" and "weight conscious." These categories strike me as equal parts confusing and problematic: (Link to image expansion)

      I have a "Marital Status" toolbar alleging that 52% of my audience is married and 49% single.

      I also have a "Home Ownership" chart. (I'm presuming that the Elizabeth Gaskell House Museum's Twitter is counted as an owner...)

      ....and more

    1. On the other hand, a resource may be generic in that as a concept it is well specified but not so specifically specified that it can only be represented by a single bit stream. In this case, other URIs may exist which identify a resource more specifically. These other URIs identify resources too, and there is a relationship of genericity between the generic and the relatively specific resource.

      I was not aware of this page when the Web Annotations WG was working through its specifications. The word "Specific Resource" used in the Web Annotations Data Model Specification always seemed adequate, but now I see that it was actually quite a good fit.

  13. Aug 2019
    1. Data visualization depicts information in graphical form. Contents: Principles, Types, Selecting charts, Style, Behavior, Dashboards.

      Principles. Data visualization is a form of communication that portrays dense and complex information in graphical form. The resulting visuals are designed to make it easy to compare data and use it to tell a story – both of which can help users in decision making. Data visualization can express data of varying types and sizes: from a few data points to large multivariate datasets.

      • Accurate: prioritize data accuracy, clarity, and integrity, presenting information in a way that doesn’t distort it.
      • Helpful: help users navigate data with context and affordances that emphasize exploration and comparison.
      • Scalable: adapt visualizations for different device sizes, while anticipating user needs on data depth, complexity, and modality.

      Types. Data visualization can be expressed in different forms. Charts are a common way of expressing data, as they depict different data varieties and allow data comparison. The type of chart you use depends primarily on two things: the data you want to communicate, and what you want to convey about that data. These guidelines provide descriptions of various different types of charts and their use cases.

      • Change over time: show data over a period of time, such as trends or comparisons across multiple categories. Use cases include stock price performance, health statistics, and chronologies. Charts include line, bar, stacked bar, candlestick, area, timeline, horizon, and waterfall charts.
      • Category comparison: compare data between multiple distinct categories. Use cases include income across different countries, popular venue times, and team allocations. Charts include bar, grouped bar, bubble, multi-line, parallel coordinate, and bullet charts.
      • Ranking: show an item’s position in an ordered list. Use cases include election results and performance statistics. Charts include ordered bar, ordered column, and parallel coordinate charts.
      • Part-to-whole: show how partial elements add up to a total. Use cases include consolidated revenue of product categories and budgets. Charts include stacked bar, pie, donut, stacked area, treemap, and sunburst charts.
      • Correlation: show correlation between two or more variables. Use cases include income and life expectancy. Charts include scatterplot, bubble, column and line, and heatmap charts.
      • Distribution: show how often each value occurs in a dataset. Use cases include population distribution and income distribution. Charts include histogram, box plot, violin, and density charts.
      • Flow: show movement of data between multiple states. Use cases include fund transfers and vote counts and election results. Charts include Sankey, Gantt, chord, and network charts.
      • Relationship: show how multiple items relate to one another. Use cases include social networks and word charts. Charts include network charts, Venn diagrams, chord charts, and sunburst charts.

      Selecting charts. Multiple types of charts can be suitable for depicting data. The guidelines below provide insight into how to choose one chart over another.

      Showing change over time. Change over time can be expressed using a time series chart, which represents data points in chronological order. Charts that express change over time include line charts, bar charts, and area charts.

      • Line chart: to express minor variations in data. Baseline value*: any value. Quantity of time series: any (works well for charts with 8 or more time series). Data type: continuous.
      • Bar chart: to express larger variations in data, how individual data points relate to a whole, comparisons, and ranking. Baseline value*: zero. Quantity of time series: 4 or fewer. Data type: discrete or categorical.
      • Area chart: to summarize relationships between datasets and how individual data points relate to a whole. Baseline value*: zero (when there’s more than one series). Quantity of time series: 8 or fewer. Data type: continuous.

      * The baseline value is the starting value on the y-axis.

      Bar and pie charts. Both bar charts and pie charts can be used to show proportion, which expresses a partial value in comparison to a total value. Bar charts express quantities through a bar’s length, using a common baseline; pie charts express portions of a whole, using arcs or angles within a circle. Bar charts, line charts, and stacked area charts are more effective at showing change over time than pie charts. Because all three of these charts share the same baseline of possible values, it’s easier to compare value differences based on bar length. Do: use bar charts to show changes over time or differences between categories. Don’t: don’t use multiple pie charts to show changes over time; it’s difficult to compare the difference in size across each slice of the pie.

      Area charts. Area charts come in several varieties, including stacked area charts and overlapped area charts. Stacked area charts show multiple time series (over the same time period) stacked on top of one another; overlapped area charts show multiple time series (over the same time period) overlapping one another. Overlapping area charts are not recommended with more than two time series, as doing so can obscure the data. Instead, use a stacked area chart to compare multiple values over a time interval (with time represented on the horizontal axis). Do: use a stacked area chart to represent multiple time series and maintain a good level of legibility. Don’t: don’t use overlapped area charts, as they obscure data values and reduce readability.

      Style. Data visualizations use custom styles and shapes to make data easier to understand at a glance, in ways that suit the user’s needs and context. Charts can benefit from customizing the following: graphical elements, typography, iconography, axes and labels, legends and annotations.

      Styling different types of data. Visual encoding is the process of translating data into visual form. Unique graphical attributes can be applied to both quantitative data (such as temperature, price, or speed) and qualitative data (such as categories, flavors, or expressions). These attributes include shape, color, size, area, volume, length, angle, position, direction, and density.

      Expressing multiple attributes. Multiple visual treatments can be applied to more than one aspect of a data point. For example, a bar color can represent a category, while a bar’s length can express a value (like population size). Shape can be used to represent qualitative data: in this chart, each category is represented by a specific shape (circles, squares, and triangles), which makes it easy to compare data both within a specific range and against other categories.

      Shape. Charts can use shapes to display data in a range of ways. A shape can be styled as playful and curvilinear, or precise and high-fidelity, among other ways in between. Level of shape detail: charts can represent data at varying levels of precision. Data intended for close exploration should be represented by shapes that are suitable for interaction (in terms of touch target size and related
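
      To make the bar/area-versus-pie guidance above concrete, a small matplotlib sketch with invented numbers, plotting three categories over time as a stacked area chart (the recommended alternative to a row of pie charts):

      import matplotlib.pyplot as plt

      years = [2017, 2018, 2019, 2020]
      # Invented values for three categories.
      category_a = [3, 4, 6, 7]
      category_b = [2, 3, 3, 4]
      category_c = [1, 1, 2, 2]

      fig, ax = plt.subplots()
      ax.stackplot(years, category_a, category_b, category_c, labels=["A", "B", "C"])
      ax.set_xlabel("year")
      ax.set_ylabel("value")
      ax.legend(loc="upper left")
      ax.set_title("Stacked area: change over time across categories")
      plt.show()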