39 Matching Annotations
  1. May 2022
    1. The process of building automated decision-making systems is riddled with biases in two major areas: in the selection of features and within the data.

      Improving the quality of selected features does not conflict with cleaning the original data: feature selection is largely a technical problem, while bias in the original data is tied to complex social conditions.

    2. Racial and gender discrimination are entrenched in society and these biases are reproduced in the digital realm.

      Eliminating discrimination from the digital realm sounds reasonable, but if no effective techniques exist, why not turn back to making the real world better first?

    1. The idea that artificial neural network architecture (and with it, “deep learning”) is the breakthrough technology for creating conscious, or even sentient, machines fuels the looming fear of robots taking our jobs. It prompts us to picture the Terminator, rather than a server farm, in our head.

      The expectations placed on AI sound ideal, but a real gap remains before they become possible, and people should feel free to adjust their long-term plans as the technology actually develops.

    1. ‘The social’ in data – As a basis for understanding inequality in algorithmic technology design, we must ask broader questions around how different notions of ‘the social’ get classified

      Translating broad social concepts into specific algorithmic ideas or directions is also a challenge.

    2. Computer science students must be equipped with the conceptual tools they need to reflexively locate themselves, and their practice, in the social world.

      Not only computer science students but students in other domains should be equipped with these tools, since people from different areas will collaborate within a company to accomplish large tasks.

  2. data-ethics.jonreeve.com
    1. Too often, Big Data enables the practice of apophenia: seeing patterns where none actually exist, simply because enormous quantities of data can offer connections that radiate in all directions. In one notable example, Leinweber (2007) demonstrated that data mining techniques could show a strong but spurious correlation between the changes in the S&P 500 stock index and butter production in Bangladesh.

      Beyond what big data shows, people still have to apply their own critical thinking to verify whether new findings are valid; the toy sketch below shows how easily such spurious patterns turn up.
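
      A minimal sketch of the apophenia problem, not the Leinweber study itself: mine enough unrelated random series against a target and some will correlate strongly by chance alone.

      ```python
      import numpy as np

      # Toy sketch: test 10,000 random-walk "indicators" against one random-walk
      # "index" and report the strongest correlation found. None of the series are
      # related, yet the best match is typically very strong.
      rng = np.random.default_rng(0)
      index = np.cumsum(rng.standard_normal(250))                 # pretend stock index
      indicators = np.cumsum(rng.standard_normal((10_000, 250)), axis=1)

      correlations = np.array([np.corrcoef(index, s)[0, 1] for s in indicators])
      best = int(np.argmax(np.abs(correlations)))
      print(f"strongest |r| = {abs(correlations[best]):.2f} (series #{best})")
      # Typically prints an |r| above 0.9 -- a pattern found purely by searching.
      ```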

  3. Apr 2022
    1. Only Appriss knows exactly how this score is derived, but according to the company’s promotional material, its predictive model not only draws from state drug registry data, but “may include medical claims data, electronic health records, EMS data, and criminal justice data.”

      This kind of ML system may run into interpretability problems if it is designed to be this complicated, since it becomes hard to pin down any standard rule for how the risk score is calculated.

    2. It notes the number of pharmacies a patient has visited, the distances she’s traveled to receive health care, and the combinations of prescriptions she receives. 

      Additional training may be needed to help doctors extract information from these records and make decisions based on them.

    1. Consumers typically do not have the right to opt out of being the subject of a consumer score or to prevent use of a consumer score.

      Consumers cannot avoid buying products from these companies, so they have no choice but to accept the consumer score, which lets the companies preserve their position.

    2. Another area of concern is the factors used in new consumer scores, which may include readily commercially available information about race, ethnicity, religion, gender, marital status, and consumer-reported health information.

      Such private information should be carefully regulated by law to prevent discrimination along those lines.

    1. Despite the widespread beliefs in the Internet as a democratic space where people have the power to dynamically participate as equals, the Internet is in fact organized to the benefit of powerful elites, including corporations that can afford to purchase and redirect searches to their own sites.

      Much of what people see or get from the Internet is determined by large corporations and advertisers, which means people are gradually shaped, even dominated, by the Internet and its recommendation algorithms. Unfortunately, most people have no choice but to keep using it.

    2. Halavais suggests that every user of a search engine should know how the system works, how information is collected, aggregated, and accessed. To achieve this vision, the public would have to have a high degree of computer programming literacy to engage deeply in the design and output of search.

      Search engine companies should provide more detail and explanation so that the public can interpret their search algorithms, stay aware of what is happening to their queries, and better judge the results.

    1. For researchers, the only legally safe and technically feasible way to obtain data of search results is to use a third-party scraper.

      This restriction tends to prevent researchers from making unbiased evaluations of tech companies' products and may help entrench their monopolies.

    2. This world map shows top image results for searches for “Tiananmen Square” made in most Google-supported countries, overlaid on the approximate geographic location of each country. Most countries show photographs of the 1989 protests, except for China and a few surrounding countries, which show touristic and promotional images.

      These divergent results can shape people's understanding of an unknown or unfamiliar subject, which means the knowledge people get from Google is partly determined by ideology and politics.

    1. In practical terms, researchers use two main approaches to privacy concerns with geocoded data: first, various methods to blur or restrict the data available on a particular query (k-anonymity), and second, introducing tracking uncertainty by pruning data in order to reduce ‘time to confusion’ – the length of time an adversary can accurately track an individual

      These methods can reduce a dataset's value for research, since researchers generally want the most detailed data possible. But that trade-off is precisely what protects people's privacy: we cannot count on restraining corporate interests unless strict regulations are put in place in advance. A rough sketch of the blurring idea follows.
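
      A minimal sketch of the first approach, with invented field names rather than anything from the paper: snap coordinates to a coarse grid and release only cells observed for at least k distinct people.

      ```python
      from collections import defaultdict

      K = 5        # minimum number of distinct people per released grid cell
      GRID = 0.01  # grid size in degrees (roughly 1 km); coarser grids blur more

      def to_cell(lat, lon, grid=GRID):
          """Snap a coordinate to its grid cell."""
          return (round(lat / grid) * grid, round(lon / grid) * grid)

      def anonymize(points, k=K):
          """points: iterable of (user_id, lat, lon) tuples.
          Returns coarsened points, dropping cells seen for fewer than k distinct users."""
          snapped = [(user, to_cell(lat, lon)) for user, lat, lon in points]
          occupants = defaultdict(set)
          for user, cell in snapped:
              occupants[cell].add(user)
          return [(user, cell) for user, cell in snapped if len(occupants[cell]) >= k]
      ```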

    2. If, however, there is access to the location data being sent by the phone itself, the potential to track people is limited only by the phone’s access to the network – and by the researcher’s ability to understand how the data may represent what is happening on the ground.

      Researchers have to account for the uncertainty introduced by the phone's network access: without knowing the actual connection conditions, people's behavior can easily be misread.

    1. No identity, whether racial or ethnic, exists outside of a social interaction, and it follows as a matter of course that one of the purposes of this study, in recontextualizing the categories used by the census, is to contribute to the weakening of the essentialist interpretations of racial categories, for which the census was able to supply content.

      Both racial and ethnic identities arise from society and cannot be interpreted in isolation; the census helps people construct racial categories and helps society better understand its present situation.

    2. This work is guided by the idea that the population does not preexist the census, that the census participates in the production of the national community defined by the inclusion of some and the exclusion of others.

      What the population looks like depends on how the census is designed, which is why people care about the classification methods used in different periods. Census results can profoundly influence how a society develops.

    1. This conversation is remarkable for the clarity with which the Census director articulates race and class, opposing the “wealthy class of Mexicans” and the “peon class” as to the different racial self-identification he believes them to practice.

      The official definition can hardly satisfy everyone: people of different wealth and social status may identify themselves differently, and both groups feel that being classified as Mexican will harm their daily lives.

    2. The problem for the census was more complex than in the case of Asian immigrants, because it was understood that some persons of Mexican origin were white, and it was necessary to remove them from the Mexican racial category.

      The criteria grew more complicated as more indicators were added to the racial classification. The method cannot be considered logically rigorous; it is inherently subjective. Yet that very subjectivity is what allowed people to be sorted into racial categories, and it kept functioning over time.

  4. Mar 2022
    1. Google has painted itself as a company dedicated to “ethical” A.I. But it is often reluctant to publicly acknowledge flaws in its own systems.

      Google’s main focus may be profit, but having achieved so much through its groundbreaking research, the company is also responsible for fixing the flaws in its products.

    2. Researchers worry that the people who are building artificial intelligence systems may be building their own biases into the technology. Over the past several years, several public experiments have shown that the systems often interact differently with people of color — perhaps because they are underrepresented among the developers who create those systems.

      Maintaining diversity on the development team may help reduce bias, since people can speak for the interests of different racial groups.

    1. Starting with who is contributing to these Internet text collections, we see that Internet access itself is not evenly distributed, resulting in Internet data overrepresenting younger users and those from developed countries [100, 143].

      Even if people contributed to the Internet equally, they might still feel uncomfortable when NLP models return content they are unfamiliar with, because their own social circles are not evenly distributed: they mostly interact with their peers.

    2. Significant time should be spent on assembling datasets suited for the tasks at hand rather than ingesting massive amounts of data from convenient or easily-scraped Internet sources.

      Could we keep AI systems from generating text that reflects the ugliness and cruelty of the real world by adding a correction mechanism to the system itself, rather than curating the data input, effectively telling the model to discard some of what it has been fed? A toy version of such a filter is sketched below.
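
      One hypothetical shape for such a correction mechanism is a post-hoc output filter that rejects or regenerates flagged text; the `generate` function and the blocklist here are placeholders, not any real system's API.

      ```python
      import re

      # Placeholder patterns; a real system might use a trained toxicity classifier.
      BLOCKLIST = [re.compile(p, re.IGNORECASE) for p in [r"\bexample_slur\b"]]

      def generate(prompt: str) -> str:
          """Stand-in for a real language model."""
          return "placeholder model output for: " + prompt

      def safe_generate(prompt: str, max_tries: int = 3) -> str:
          """Regenerate until the output passes the filter, then withhold the response."""
          for _ in range(max_tries):
              text = generate(prompt)
              if not any(p.search(text) for p in BLOCKLIST):
                  return text
          return "[response withheld]"
      ```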

    1. This ensures that future economic opportunities always belong to the communities from which the data was gathered.

      This hope is hard to guarantee: it runs against the trend of globalization, and as the Maori community weighs the pros and cons of opening access to their language data, they may change their decision.

    2. The tools for building speech-to-text systems – which allow Te Hiku to transcribe their radio content – and other speech recognition technology are fairly accessible, such as Mozilla’s open-source tool Deep Speech.

      The speech-to-text system could probably be adapted to a specific language like Maori through fine-tuning or small modifications, making training faster and less complex than starting from scratch; a rough sketch follows.
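
      A hypothetical sketch of that kind of adaptation in generic PyTorch, not Deep Speech's actual API or architecture: freeze a pretrained encoder and retrain only a new output layer sized for the target language's character set.

      ```python
      import torch
      import torch.nn as nn

      class AcousticModel(nn.Module):
          """Toy per-frame character classifier standing in for a real acoustic model."""
          def __init__(self, n_mels=80, hidden=512, n_chars=29):
              super().__init__()
              self.encoder = nn.Sequential(
                  nn.Linear(n_mels, hidden), nn.ReLU(),
                  nn.Linear(hidden, hidden), nn.ReLU(),
              )
              self.head = nn.Linear(hidden, n_chars)   # per-frame character logits

          def forward(self, x):                        # x: (batch, time, n_mels)
              return self.head(self.encoder(x))

      model = AcousticModel()
      # model.load_state_dict(torch.load("pretrained_english.pt"))  # hypothetical checkpoint

      # Freeze the encoder; swap in a head sized for the new language's character set
      # (the size 16 is an assumption for illustration).
      for p in model.encoder.parameters():
          p.requires_grad = False
      model.head = nn.Linear(512, 16)

      optimizer = torch.optim.Adam(model.head.parameters(), lr=1e-4)
      # A training loop over speech/transcript pairs (e.g. with CTC loss) would go here.
      ```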

    1. It is important to recognize that some minority language communities contain groups that propose violent means to achieve separatist or supremacist aims, and surveillance of their activities is essential to saving lives.

      I wonder whether this kind of surveillance slides easily into discrimination, since it requires focusing on groups judged more likely to advocate violence or with records of violent activity, even though such surveillance saves lives.

    2. This is of particular import for us to consider, as the minority and Indigenous languages that are least supported digitally are also disproportionately at odds with national governments and corporate powers, sometimes by virtue of their very existence.

      With the convenience of digital systems and technologies comes surveillance. These communities cannot set the direction of digital development; they may embrace new digital tools at first, only to feel far more constrained once they notice the surveillance that comes with them.

    1. In any context where an automated decision-making system must allocate resources or punishments among multiple groups that have different outcomes, different definitions of fairness will inevitably turn out to be mutually exclusive.

      This problem goes beyond code or algorithms; it really involves philosophical thinking. Fairness means different things to different people, which makes it hard to evaluate a system from a conventional scientific standpoint.

    2. It trains on historical defendant data to find correlations between factors like someone’s age and history with the criminal legal system, and whether or not the person was rearrested. It then uses the correlations to predict the likelihood that a defendant will be arrested for a new crime during the trial-waiting period.1

      Relatively simple correlation-based methods retain some interpretability, so the developers of COMPAS can offer a brief, though not sufficient, explanation of the decisions it makes. The sketch below shows what such an inspectable model can look like.
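
      A toy illustration of "finding correlations" with an inspectable model, using synthetic data and invented feature names rather than COMPAS's actual model or inputs: a logistic regression whose coefficients can be read off directly.

      ```python
      import numpy as np
      from sklearn.linear_model import LogisticRegression

      rng = np.random.default_rng(0)
      n = 2_000

      # Entirely synthetic defendant-style features, for illustration only.
      age = rng.integers(18, 70, n)
      prior_arrests = rng.poisson(2, n)
      X = np.column_stack([age, prior_arrests])

      # Synthetic "rearrested" labels correlated with the features.
      logit = -2.0 - 0.03 * (age - 40) + 0.5 * prior_arrests
      y = rng.random(n) < 1 / (1 + np.exp(-logit))

      model = LogisticRegression().fit(X, y)
      for name, coef in zip(["age", "prior_arrests"], model.coef_[0]):
          print(f"{name}: {coef:+.3f}")   # each factor's learned weight is inspectable
      ```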

    1. At the most basic level, wages need to be required to rise in tandem with productivity — especially when it comes to “low-skill” work that keeps the most crucial parts of our economy afloat.

      Although “low-skill” work seems replaceable, those job opportunities mean more than efficiency or cost savings: they are closely tied to people's livelihoods.

    1. But consider also that the “black box” nature of automated hiring affords opportunities for a villainous employer to engage in what I term “data-laundering”—that is to use big data and its concomitant algorithmic processes in such a way as to achieve discriminatory results while maintaining an appearance of impartiality

      It’s hard to say whether black-box automated hiring is any fairer than human decision-makers, since people can also let their prejudices quietly determine outcomes without being noticed.

    2. The folly in this oracular reliance on big data-driven algorithmic systems is that without proper interpretation, the decision-making of algorithmic systems could devolve to apophenia, which results in “seeing patterns where none actually exist, simply because enormous quantities of data can offer connections that radiate in all directions”

      For now, data-driven algorithmic systems cannot achieve genuine intelligence, since they do not reliably produce correct results. Humans still have to supply proper interpretation and correction to assist these systems.

  5. Feb 2022
    1. Whether deciding which teacher to hire or fire or which loan applicant to approve or decline, automated systems are alluring because they seem to remove the burden from gatekeepers, who may be too overworked or too biased to make sound judgments.

      Automated systems seem to relieve the dilemma of human bias. But when a machine makes a biased decision, the people who designed the algorithm may still be blamed, since humans ultimately determine how every automated system behaves.

    2. Once someone is added to the database, whether or not they know they are listed, they undergo even more surveillance and lose a number of rights.

      Once such a database exists, people cannot prevent their names from being added to it and compared against historical data. Technology increases the chance that a name gets linked to a bad history, which deepens discrimination by name and harms people's rights.

    1. A role on Google’s AI board was an unpaid, toothless position that cannot possibly, in four meetings over the course of a year, arrive at a clear understanding of everything Google is doing, let alone offer nuanced guidance on it.

      Google did not attach much importance to the real progress or benefit of its AI board and other AI-ethics work; it merely pretended to focus on the issue while setting impossible goals, although we have to admit that AI ethics is not an easy problem to handle.

    2. AI capabilities are continuing to advance, leaving most Americans nervous about everything from automation to data privacy to catastrophic accidents with advanced AI systems.

      It is time to ease public anxiety about the uncertainty of AI. Big tech companies have a responsibility to explain these complex concepts more clearly, not only to researchers and students but also to ordinary people, which could keep the public from turning against the technology.

  6. data-ethics.jonreeve.com
    1. they are also the ones who get to determine the rules

      This probably requires negotiation with other parties, such as the government or specific regulatory agencies, to integrate resources from different areas. I don’t believe it is appropriate to let the rules be determined solely by those who analyze the big data.

    2. and these errors and gaps are magnified when multiple data sets are used together.

      I wonder whether we could actually improve data quality as we combine multiple data sets, since the merging step gives us a chance to clean the data, extract useful information, and correct biases and errors.