102 Matching Annotations
  1. May 2025
    1. CONCLUSION
      - GPT's performance (percentile) compared to human test-takers was lower than reported by OpenAI
      - the scaled essay score also deviated from the "true" essay score => this could imply that the actual UBE percentile was lower than reported
      - the reported score (298) is 28 points higher than the passing score => the essay scores would have to be extremely inaccurate to undermine the conclusion of Katz et al. (that GPT passed the bar exam)

      but still

    2. RE-EXAMINING ESSAY SCORES
      1. for the scaled MBE score, the computing methods matched the official methods
      2. for the MPT + MEE (essays), however, the method was significantly changed, to the point where you can question whether others grading blindly (according to official protocols) would give the same score

      3 key changes:
      1. lack of a formal rubric (no grading guidelines like the NCBE's, no model answers etc.)
        - Katz et al. don't mention any rubric, only that they compared the answers to "good" answers from the state of Maryland
        - that's already problematic because they didn't specify what scores the "good" answers got, only that they passed
        - since it's unclear what scores they got, and those answers were the basis for determining GPT's score => it makes sense to treat the score of GPT's answers as unclear too
      2. lack of NCBE training for the essay graders
        - Katz et al. used a subset of the authors who were trained lawyers
      3. lack of blinding & of a consistent manner of grading
        - official graders first have to grade 30 "calibration" essays of variable quality, to make sure consistent scores are assigned to answers of similar quality
        - the Katz et al. method didn't involve blinding or calibration like that
        - the authors gave samples to independent lawyers to grade and either "match" the grade given by the authors or "exceed" it
        - but those lawyers didn't have NCBE training either, plus all the issues above

    3. ASSESSING EFFECT OF HYPERPARAMETERS

      METHODS
      - the analysis above found no effect of prompt on performance, possibly because the prompts used originally lacked variety
      - to find out whether prompts affect performance at all, he tested 2 new conditions:
        1. minimally tailored condition = prompts minimally tailored (compared to Katz et al.) in terms of formatting & substance
        2. maximally tailored condition = the highest-performing prompt settings (like the original ones) PLUS few-shot prompting => providing multiple example MBE questions with sample answers & explanations, structured in the desired format

      5 trials for each condition, all at the same temperature setting of 0.5 (because the earlier analysis showed that temperature doesn't affect performance)

      RESULTS
      - mean MBE accuracy across all trials: 79.5% in the maximally tailored condition & 70.9% in the minimally tailored one
      - the scaled score in the minimally tailored condition would be approx. 150 => 70th percentile among July takers, 64th among first-timers & 48th among those who passed
      - the scaled score in the maximally tailored condition was 164, 6 points higher than in the original papers => 95th percentile among July takers, 87th among first-timers & 82nd among those who passed

    4. (a) helpful markers (e.g. “~~’) to separate instruction and context; (b) details regarding the desired output (i.e. specifying that the response should include ranked choices, as well as [in some cases] proper authority and citation); (c) an explicit template for the desired output (providing an example of the format in which GPT-4 should provide their response); and (d) perhaps most crucially, context regarding the type of question GPT-4 was answering (e.g. “please respond as if you are taking the bar exam”).

      original prompts included these ^

      modifying prompts in the minimally tailored condition = none of these were used, only:

      “Please answer the following question,” followed by the question and answer choices (a technique sometimes referred to as “basic prompting”)

    5. REPLICATING MBE SCORE

      METHODOLOGY

      1. materials: official MBE questions released by NCBE
      2. procedure: replicating scores from OpenAI by following the protocol documented by Katz et al.
      3. testing performance at 3 different temperature settings (0, 0.5 & 1)
      4. for each of those, performance was tested with 2 different prompts (in both, GPT was told to answer as if it were taking the actual bar exam):
        a) asked to provide a top-3 ranking of possible answers + justification + citation for the answer
        b) the same but without justification + citation
      5. for each of those, 3 different trials to control for variation

      WHAT MARTINEZ ADDED - additional temperature settings: 0.25 & 0.7 - thus the total number of temperature-prompt combinations was 10, not 6 - 5 trials instead of 3 => total number of trials was 50 instead of 18
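
A quick way to sanity-check the counts in this note (18 vs 50 trials) is to enumerate the grid of settings. This is only a sketch: the prompt labels and the `build_runs` helper are my own names, not taken from either paper.

```python
from itertools import product

# Temperatures as listed in the notes; prompt labels are illustrative shorthand.
ORIGINAL_TEMPS = [0, 0.5, 1]
EXTENDED_TEMPS = [0, 0.25, 0.5, 0.7, 1]
PROMPTS = ["ranking + justification + citation", "ranking only"]


def build_runs(temps, prompts, n_trials):
    """Return one (temperature, prompt, trial) tuple per run."""
    return list(product(temps, prompts, range(1, n_trials + 1)))


original_runs = build_runs(ORIGINAL_TEMPS, PROMPTS, n_trials=3)
extended_runs = build_runs(EXTENDED_TEMPS, PROMPTS, n_trials=5)

# Temperature-prompt combinations: 6 originally, 10 in the extended design.
original_combos = len(ORIGINAL_TEMPS) * len(PROMPTS)
extended_combos = len(EXTENDED_TEMPS) * len(PROMPTS)

print(original_combos, len(original_runs))  # 6 18
print(extended_combos, len(extended_runs))  # 10 50
```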

      after prompting, they calculated raw scores (using official answer keys) -> then scaled scores (multiplying the raw score by 190, then dividing by 200, then converting to a scaled score using official NCBE data)
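
The first scaling step described here (multiply by 190, divide by 200) is plain arithmetic, sketched below. The function name and the example raw score of 150 are my own. The last step, converting to an official scaled score, needs the NCBE conversion data and is deliberately left out rather than guessed.

```python
def rescale_raw_score(raw_correct, multiplier=190, divisor=200):
    """Apply the 'times 190, divided by 200' step from the notes.

    The result still has to be converted to a scaled score using
    official NCBE data, which is not reproduced here.
    """
    return raw_correct * multiplier / divisor


# Hypothetical raw score, for illustration only.
example_rescaled = rescale_raw_score(150)
print(example_rescaled)  # 142.5
```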

      RESULTS: -mean accuracy here: 75.6% (original: 75.7%) - GPT'S RAW ACCURACY WAS NOT SIGNIFICANTLY LOWER/HIGHER AT A GIVEN TEMPERATURE SETTING OR WHEN FED A CERTAIN PROMPT

    6. (context) parameters = configurations that determine how the NN will manipulate data and make predictions - for NNs, parameters are basically weights & biases - they simplify the identification of machine learning data; they shape how a NN propels data forward (forward propagation) - once that's completed, the NN refines all the connections based on the errors that occurred during forward propagation -> that's backward propagation = the flow going backwards through the layers & connections of the NN to readjust them

      WEIGHTS - manage connections between two basic units = neurons in a NN (weights of those units' signals are decreased/increased to train those units to move forward in the NN, during forward propagation) BIASES -

    7. (context) NEURAL NETWORK "an algorithm built to work like a human brain. It is composed of multiple layers of neurons. It starts with an input layer consisting of independent neurons that do not rely on any weighted signal. It introduces primary data. The input layer then feeds into one to two hidden layers. The hidden layers contain neurons and biases that place value to data that sorts everything into the output layer. The output layer expresses the data identification for machine learning models"
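
The weights and biases from the two context notes above can be made concrete with a tiny forward pass. This is a toy sketch in plain Python: the layer sizes, the numbers and the sigmoid activation are all made up for illustration, and no training (backward propagation) is shown.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense_layer(inputs, weights, biases):
    """Forward propagation through one layer.

    weights[j][i] is the weight on the connection from input i to neuron j;
    biases[j] is added to neuron j's weighted sum before the activation.
    """
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        weighted_sum = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(sigmoid(weighted_sum))
    return outputs


# Toy network: 2 inputs -> 2 hidden neurons -> 1 output neuron.
inputs = [1.0, 0.0]
hidden = dense_layer(inputs, weights=[[0.5, -0.5], [0.0, 0.0]], biases=[-0.5, 0.0])
output = dense_layer(hidden, weights=[[1.0, 1.0]], biases=[-1.0])

print(hidden)  # [0.5, 0.5]
print(output)  # [0.5]
```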

    8. research questions/goals 1. assuming the 298 score is legit, does it warrant the 90th percentile claim? 2. is it worth questioning the 298 claim? (replication, reproducibility - can it even be verified?) 3. given various settings/parameters in GPT-4, is it worth assessing how adjusting those settings will affect GPT's performance?

      methodology

      • paper attempts to replicate the reported MBE score using methods as close as possible to original papers
      • comparing MBE performance using various (best & worst) hyperparameter settings
      • re-examining performance on the essays by

      a) evaluating to which extent the methodology of grading GPT4 essays deviated from official protocol used by NCBE during actual bar exam

      b) to which extent such deviations might undermine one's confidence in the scaled essay scores reported by OpenAI

    9. this analysis has taken for granted the scaled score achieved by GPT-4 as reported by OpenAI - that is, assuming GPT-4 scored a 298 on the UBE, is the 90th-percentile figure reported by OpenAI warranted?

      so far he didn't question the scaled score = 298 reported by openai

    10. Results

      RESULTS:

      against 1st timers:
      - for each component, as well as the overall UBE score, the percentile among July 1st timers is LOWER than both the OpenAI estimate (February) & the July estimate that includes repeat takers
      - overall UBE: 62nd percentile (instead of the 90th estimated for February & 68th for July)
      - MBE: 79th percentile (instead of 95th & 86th)
      - MEE + MPT: 42nd (instead of 69th & 48th)

      against attorneys - results dropped even more:
      - UBE: 45th percentile
      - MBE: 69th
      - MEE + MPT: 15th
      - GPT-3.5 = 0th percentile for all

    11. z-score

      measures how many standard deviations above or below the mean a data point is

      basically z score tells u the distance between a particular score and the mean (how far that score is from the mean in a distribution)
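
As a formula, the z-score is (score - mean) / SD. A minimal sketch with made-up numbers (none of these values come from the paper):

```python
def z_score(score, mean, sd):
    """How many standard deviations `score` lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (score - mean) / sd


# Hypothetical distribution: mean 100, SD 15.
z_above = z_score(130, mean=100, sd=15)
z_below = z_score(85, mean=100, sd=15)

print(z_above)  # 2.0  (two SDs above the mean)
print(z_below)  # -1.0 (one SD below the mean)
```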

    12. standard deviation of first-time MBE scores was computed by

      COMPUTING STANDARD DEVIATION FOR 1ST TIMERS SCORES:

      MBE score
      1. taking the publicly available distribution of MBE scores from the NCBE website (July scores for all takers, because a first-timers-only distribution is unavailable)
      2. assuming that first-timers have approximately the same SD as the whole July population
      3. computing the SD for first-timers taking the MBE:
        - entering that publicly available distribution of MBE scores into R (program)
        - applying the built-in sd() function to it (which computes the sample standard deviation)
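
The paper did this step in R. A rough Python equivalent, assuming the published distribution comes as a score-to-count table, could look like the sketch below. The table is a toy placeholder, not NCBE data, and `sd_from_frequency_table` is my own helper name. `statistics.stdev` uses the n - 1 denominator, like R's `sd()`.

```python
import statistics


def sd_from_frequency_table(table):
    """Sample SD of a distribution given as {score: number_of_takers}."""
    values = [score for score, count in table.items() for _ in range(count)]
    return statistics.stdev(values)


# Toy placeholder distribution (NOT real NCBE data).
toy_distribution = {130: 1, 140: 2, 150: 1}
toy_sd = sd_from_frequency_table(toy_distribution)

print(round(toy_sd, 3))  # 8.165
```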

      essay scores exactly the same (bc mean & SD is the same as MBE)

      UBE SDs aren't publicly known for the official exam, but they can be inferred by combining the mean UBE score for 1st timers (287.6) & 1st-time pass rates
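
One way such an inference could work, if you assume scores are normally distributed: the pass rate fixes how many SDs the cut score sits from the mean, so the SD can be solved for. This is my own sketch and not necessarily the paper's exact procedure. The mean (287.6) is from the notes, the cut score of 270 is 298 minus the 28-point margin mentioned in the conclusion note, and the 80% pass rate is a made-up placeholder.

```python
from statistics import NormalDist


def sd_from_pass_rate(mean, cut_score, pass_rate):
    """Solve for the SD of a normal distribution from its mean and pass rate.

    Assumes P(score >= cut_score) == pass_rate under a normal distribution.
    """
    if not 0 < pass_rate < 1 or pass_rate == 0.5:
        raise ValueError("pass_rate must be in (0, 1) and not exactly 0.5")
    # z-value of the cut score: the fraction (1 - pass_rate) falls below it.
    z_cut = NormalDist().inv_cdf(1 - pass_rate)
    sd = (cut_score - mean) / z_cut
    if sd <= 0:
        raise ValueError("mean, cut score and pass rate are inconsistent")
    return sd


# 287.6 is from the notes; 270 and 0.80 are assumptions for illustration.
inferred_sd = sd_from_pass_rate(mean=287.6, cut_score=270, pass_rate=0.80)

# Round-trip check: the implied distribution reproduces the pass rate.
implied_pass_rate = 1 - NormalDist(287.6, inferred_sd).cdf(270)
print(round(implied_pass_rate, 2))  # 0.8
```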

    13. total UBE score is computed directly by adding MBE and essay scores (National Conference of Bar Examiners n.d.-h), an assumption was made that mean first-time UBE score is 287.6 (143.8 + 143.8).

      mean of overall bar exam scores

    14. parameters

      in statistics: population = a whole group of objects that's the focus of the study

      sample - a certain number of those objects

      parameter = a number referring to & describing the entire population; statistic = the same thing, only it refers to the sample

    15. methodology here was to first compute these parameters, then generate distributions with these parameters, and then compute (a) what percentage of values on these distributions are lower than GPT’s scores (to estimate the percentile against first-timers); and (b) what percentage of values above the passing threshold are lower than GPT’s scores (to estimate the percentile against qualified attorneys).

      METHODOLOGY: - compute parameters from UBE, from MBE & from essays - generate distributions with these parameters

      COMPUTE PERCENTILES OF GPT'S SCORE - compute what % of values on these distributions are lower than GPT scores = to estimate percentile against 1st timers - compute what % of values above the passing threshold are lower than GPT score = to estimate the percentile against attorneys
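
The two percentile computations above can be sketched as follows. Note the swap: the paper generated distributions and counted values, whereas this sketch uses the closed-form normal CDF, which amounts to the same thing if scores are assumed normal. All numbers (mean 100, SD 15, cut score 85) are toy values and the function names are mine.

```python
from statistics import NormalDist


def percentile_among_all(score, mean, sd):
    """Percent of the whole (normal) distribution scoring below `score`."""
    return 100 * NormalDist(mean, sd).cdf(score)


def percentile_among_passers(score, mean, sd, cut_score):
    """Percent of those at or above `cut_score` who score below `score`."""
    dist = NormalDist(mean, sd)
    below_cut = dist.cdf(cut_score)
    if score <= cut_score:
        return 0.0  # a score at or below the threshold beats no passer
    return 100 * (dist.cdf(score) - below_cut) / (1 - below_cut)


# Toy values, for illustration only.
pct_all = percentile_among_all(100, mean=100, sd=15)
pct_passers = percentile_among_passers(100, mean=100, sd=15, cut_score=85)

print(round(pct_all, 1))      # 50.0
print(round(pct_passers, 1))  # 40.6
```

The same score always ranks lower among passers than among all takers, because the weakest part of the distribution is cut away. That is the pattern in the notes' results, where the percentiles against attorneys drop below those against first-timers.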

    16. more accurate estimates (for GPT-3.5 and GPT-4) were sought to be computed here based on first-time test-takers, including both (a) first-time test-takers overall, and (b) those who passed.

      1st step of methodology - using data from more accurate estimates (for GPT 3.5 and 4) instead of the estimates from Open Ai & the july estimate = for that, he estimated scores separately for MBE, for essays & the overall UBE score

    17. how could the percentile estimation be made more accurate?
      - by using only the scores of people who take the bar exam for the first time
      - for comparisons with attorneys' results, by limiting the data to 1st time takers who ALSO achieved a passing score
      - the evidence they used is based purely on Illinois exam data (because there are no official percentiles), which is not identical to the actual UBE scoring & content => it would be more accurate to use data directly from official NCBE sources

    18. early draft version of the paper, “GPT-4 passes the bar exam,”

      another source of evidence - this one is well documented & transparent about its methodology BUT it doesn't focus on percentiles, like the original report - it focuses on the model's score compared to average test takers

    19. after providing relatively detailed breakdowns of its methodology for scoring the SAT, GRE, SAT, AP, and AMC, the report states that “[o]ther percentiles were based on official score distributions,”

      example of lack of backing up the evidence

    20. structure of the paper: 1. evaluating the 90th percentile claim: 4 findings about the actual bar exam performance 2. investigation of validity of the score 3. investigation of adjusting temperature settings 4. conclusion: estimates of the percentile are over inflated

    21. lack of transparency could undermine our confidence in the prospect of safe deployment of AI (Brundage et al. 2020; Li et al. 2023). In particular, releasing models without an accurate and transparent assessment of their capabilities (including by third-party developers) might lead to unexpected misuse/misapplication of those models (within and beyond legal contexts),

      possible outcomes = for AI: misuse of the models, releasing them without proper assessment

    22. may lead both lawyers and non-lawyers to rely on generative AI tools when they otherwise wouldn’t and arguably shouldn’t, plausibly increasing the prevalence of bad legal outcomes as a result of (a) judges misapplying the law; (b) lawyers engaging in malpractice and/or poor representation of their clients; and (c) non-lawyers engaging in ineffective pro se representation.

      possible consequences = for humans: use of AI by lawyers

    23. provides no direct citation for how the UBE percentile was computed, creating further uncertainty over both the original source and validity of the 90th percentile claim.

      2nd methodological uncertainty = no knowledge on how they evaluated gpt's performance

    24. the administrators of the Uniform Bar Exam (the NCBE as well as different state bars) do not release official percentiles of the UBE

      1st methodological uncertainty = lack of official percentiles

    25. boost in performance of GPT-4 over its predecessor GPT-3.5 (80 percentile points) far exceeded that of any other test, including seemingly related tests such as the LSAT (40 percentile points), GRE verbal (36 percentile points), and GRE Writing (0 percentile points)

      previous scores on the bar exam

    26. thus knowledge (or ignorance) of that content does not necessarily translate to knowledge (or ignorance) of relevant legal doctrine for a practicing lawyer of any jurisdiction; and (b) the tasks involved on the bar exam, particularly multiple-choice questions, do not reflect the tasks of practicing lawyers, and thus mastery (or lack of mastery) of those tasks does not necessarily reflect mastery (or lack of mastery) of the tasks of practicing lawyers.

      doubts about the BAR exam itself - in regards to testing human knowledge too

    27. few-shot chain-of-thought prompting over basic zero-shot prompting.

      zero-shot - you give the model a command to generate a response, but you don't provide examples of solving the task
      few-shot - you provide a few worked examples in the prompt, so that the model can pick up the task and the desired answer format from them
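
The difference shows up directly in how the prompt text is assembled. A small sketch follows. The instruction line echoes the "basic prompting" wording quoted in these notes, but the helper names, the layout and the placeholder questions are all my own, and no model is actually called.

```python
BASIC_INSTRUCTION = "Please answer the following question."


def format_question(question, choices):
    """Render a question with lettered answer choices."""
    lines = [question]
    for label, text in zip("ABCD", choices):
        lines.append(f"({label}) {text}")
    return "\n".join(lines)


def zero_shot_prompt(question, choices):
    """Instruction plus the question only: no solved examples."""
    return f"{BASIC_INSTRUCTION}\n\n{format_question(question, choices)}\nAnswer:"


def few_shot_prompt(examples, question, choices):
    """Instruction, then solved examples, then the real question."""
    blocks = []
    for ex in examples:
        blocks.append(
            f"{format_question(ex['question'], ex['choices'])}\n"
            f"Answer: {ex['answer']}\n"
            f"Explanation: {ex['explanation']}"
        )
    blocks.append(f"{format_question(question, choices)}\nAnswer:")
    return BASIC_INSTRUCTION + "\n\n" + "\n\n".join(blocks)


# Placeholder content, not real MBE questions.
example = {
    "question": "Placeholder example question?",
    "choices": ["w", "x", "y", "z"],
    "answer": "B",
    "explanation": "Placeholder explanation of why B is right.",
}
zero = zero_shot_prompt("Placeholder target question?", ["p", "q", "r", "s"])
few = few_shot_prompt([example], "Placeholder target question?", ["p", "q", "r", "s"])
print(few)
```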

    28. paper also investigates the validity of GPT-4’s reported scaled UBE score of 298. The paper successfully replicates the MBE score, but highlights several methodological issues in the grading of the MPT + MEE components of the exam, which call into question the validity of the reported essay score.

      paper questions GPT's score of 298 - the MBE score replicates, but the grading of the essays (MPT + MEE) has methodological issues

    29. GPT-4’s performance is estimated to drop to ~48th percentile overall, and ~15th percentile on essays.

      4TH CLAIM performance against only those who passed the exam = gpt scores ~48th percentile, ~15th on essays

    30. GPT-4’s performance against first-time test takers is estimated to be ~62nd percentile, including ~42nd percentile on essays.

      3RD CLAIM = gpt's performance against actual 1st time takers is ~62nd percentile, ~42nd percentile on essays

    31. data from a recent July administration of the same exam suggests GPT-4’s overall UBE percentile was below the 69th percentile, and ~48th percentile on essays.

      SECOND CLAIM = data from more recent SAME exam suggests the score was actually BELOW the 69th percentile, 48th on essays

    32. these estimates are heavily skewed towards repeat test-takers who failed the July administration and score significantly lower than the general test-taking population.

      FIRST CLAIM - the score does approach the 90th percentile, BUT those estimates are skewed towards people retaking the test after failing in July, who score significantly lower than test-takers in general

    33. This paper begins by investigating the methodological challenges in documenting and verifying the 90th-percentile claim, presenting four sets of findings that indicate that OpenAI’s estimates of GPT-4’s UBE percentile are overinflated.

      goal of the paper = debunking OpenAI's claim, via 4 claims (sets of findings)

  2. Mar 2025
    1. in a world increasingly antagonistic to religion and its tru

      why is it increasingly antagonistic? isn't that the past? where christians would be murdered in ancient rome? jews, muslims. isn't it rather that people become skeptical, and thus THEMSELVES antagonistic? sure, it does project onto society, but i don't think it's oppressing

    1. English-speaking contractualism

      contractualism - a philosophical idea (moral philosophy)
      1. BROAD SENSE:
        - morality comes from agreements between people
        - "what we owe to each other"
        - as people, we have duties towards others, since we're all rational beings
      2. NARROW SENSE (T. M. Scanlon):
        - different from the broad sense, because (as Scanlon claims) moral claims are PRACTICAL claims about what we have reason to do, not just theoretical ones (furthermore, they're the most important, because when we decide that an action is wrong, that gives us reasons NOT to do it, especially when weighed against other reasons)
        - an action is morally bad if it cannot be justified to another human being

  3. minio.la.utexas.edu
    1. I have almost reached the regrettable conclusion that the Negro’s great stumbling block in his stride toward freedom is not the White Citizen’s Counciler or the Ku Klux Klanner, but the white moderate, who is more devoted to “order” than to justice

      sometimes the biggest harm is passiveness, indifference. because openly evil people exist, but it's the white moderate who makes the majority

    2. unjust law is a code that a numerical or power majority group compels a minority group to obey but does not make binding on itself. This is difference made legal. By the same token, a just law is a code that a majority compels a minority to follow and that it is willing to follow itself. This is sameness made legal

      definitions of just/unjust law

    3. How does one determine whether a law is just orunjust?

      through discerning which laws are in harmony with "eternal/natural law" and you can do that, knowing that "any law that degrades human personality is unjust"

    4. “How can you advocate breaking some laws and obeying others?” The answer lies in the fact that there are two types of laws: just and unjust.

      there are laws that require breaking: the unjust ones

    5. Nonviolent direct action seeks to create such a crisis and foster such a tension that a community which has constantly refused to negotiate is forced to confront the issue. It seeks so to dramatize the issue that it can no longer be ignored

      purposes of nonviolent direct action

    1. The first conception supports an ideal of ultimate convergence on values, the latter an ideal of modus vivendi.

      2 different kinds of liberalism 1. convergence/illiberalism - in the end liberalism will force their liberal/tolerance onto others 2. modus vivendi/tolerance - disagreeing but finding a way to live together

    2. THE SECULARISM OF GEORGE JACOB HOLYOAKE

      G. J. Holyoake (1851) defined secularism: modern dictionaries cite/rephrase his definitions, but in reality (Benson found) that in his works, he's not at all neutral, but rather anti-religion - basing off of Comte, Rousseau

    1. Poland: ...the concept of religion shall in particular include: (a) having theistic, non-theistic or atheistic beliefs, (b) participation, or refraining from engaging in religious rituals, performed in public or private, individually or collectively, [and] (c) other acts of a religious character, beliefs expressed [in the form] of individual or collective behaviour as a result of religious beliefs or related to them.'

      definition in poland

    2. Austria: ‘for a religion there are minimum requirements concerning a statement of belief, rules for a way of life and a cult’; religion is a ‘structure of convictions whose content is capable of representation [which] has been growing in history to explain humankind and the world in its transcendent meaning and to accompany [this] with specific rites and symbols [giving] them orientation in accordance with basic principles and doctrine’.

      definition in austria

    3. Denmark, religion is seen as ‘a specifically formulated belief in the dependence of human beings on a power over the human race [which] provides guidelines for human ethics and morality’.

      definition in denmark

    4. France: ‘a religion can be defined by the convergence of two elements, an objective element, the existence of a community even limited, and a subjective element, a common faith’

      definition in france

    5. religious belief or practice had to be linked to well-established faiths. However, in German jurisprudence today whether a belief or activity is religious is to be determined objectively by reference to ‘spiritual content and external appearance’;

      german criteria

    6. States of Europe do not generally define ‘religion’ in their constitutions or other formal legislation, but, rather, leave it to the courts to determine whether something is ‘religion’

      lack of definitions in state-level regulations

    1. in the end, Moens (somewhat surprisingly, since he pointed out all the evidence and ways that the guards could be found guilty) concluded that the trials resulted in "injustices masquerading as justice", mainly because the court failed to consider that applying the West German critical tradition (based more on natural law) to East German soldiers, who were only familiar with the German tradition (based on the Lutheran idea of total obedience to the law), was UNJUST

    2. so, normally 315 of the Reunification Treaty prevents acts committed on East German soil prior to reunification from being punished if they were not punishable under East German law (kinda like our double-criminality requirement, warunek podwójnej karalności)

      BUT, here: - immunity does not apply where West German law already applied (West German law applied to crimes on foreign soil if): 1. the acts were committed against a German 2. the person who committed them becomes a resident of WG or comes to WG

      prof. Samson argues that EG became part of WG, so their law is applicable 7(2) (similarly, the people who were shot, were Germans, so 7(1))

      BUT it's a stretch, because EGs were considered foreigners by WG

    3. the BGH was reluctant to invoke natural law, because it's hard to define what the "minimum content" should be. instead, they relied on international human rights - but they found that border shooting in itself does not violate the Covenant, only excessive or unnecessary use of it does

    4. Such imposition may result in ‘unjust’ decisions because it involves the application of West Germany’s critical tradition to East German conditions

      so basically, the critical tradition was imposed on the EG guards, which was unjust because this idea was foreign to them. they were still following the German tradition, which requires total obedience to the law

    5. West German citizens could challenge laws which are opposed to the fundamental moral values of the community. German jurisprudence since the Second World War interpreted the rights enshrined in the Basic Law not as granted by the Constitution, but as existing before it and independently of it

      critical tradition in West Germany

    6. the consequences of disobeying immoral laws must be considered by people. If the dangers resulting from disobedience substantially outweigh its benefits, people should choose obedience.

      Aquinas on natural vs positive law

    7. imposition of the post-war West German critical tradition on East German border guards who, regardless of the morality of the relevant orders or laws, were undoubtedly imbued with the German tradition of unqualified obedience to the law.

      different traditions of thinking; the critical post-war tradition was applied to the EG soldiers, who followed the German tradition to obey the law

    8. internal morality deals with the minimum conditions which every mature legal system must satisfy in order to achieve its purpose. These conditions, which are inherent in the concept of ‘law’, include the requirements that rules must be prospective, must not be constantly changing, and their implementation by officials must not be perverted.

      internal morality = Fuller

    9. 2 possible justifications in EG law: 1. appealing to the fact that the use of firearms is permissible against a serious crime (defined as one carried out with dangerous means; the court later found that ladders used to climb the Berlin Wall counted as 'dangerous means', thus the use of firearms was necessary) 2. an explicit statement that soldiers who follow orders are not criminally responsible (unless it's a blatant violation)

    10. APPLICABILITY OF WG LAW
      milder law is applicable UNLESS there was already WG law at the time of the act:
      - territoriality principle = WG law applicable to an act in EG if the consequences occurred in WG
      - act was committed against a German permanently residing in WG
      - act carried out by a West German
      - perpetrator moved to WG before the reunification

    1. reasons for temporariness of theocracies 1. lack of secular skills and means to run modern economy - by religious leaders 2. unwillingness to entertain the compromises of political + international relations

    2. Table 4.1 also includes the economic labels of monopoly, regulation, and competition. This captures the insights of recent literature applying simple economic models to religion: should the state endorse a monopoly faith or is a ‘free market’ in religion preferable?

      applying economic models to religions

    1. ad FAIRNESS 2 things have to be provided, before you can call a law "based on the moral concept of fairness" (to dispute the previous two arguments):

      1. laws have generally beneficial effects
      2. most other people obey the law (so that if you don't you benefit unfairly)
    2. COUNTER ARGUMENTS TO FORMER IDEAS:

      ad. 3 - fairness - anarchist: denial of any benefit coming from the law - those who claim it's beneficial: fairness can't apply, where obeying the law does no one any good, is useless (eg. it's not unfair to speed over the limit on a deserted road)

      ad 4. public good - act-utilitarianism: there are of course examples where total obedience to the law does more harm than good - e.g. a man earns a small income from a service to a friend and doesn't pay tax on it - if he did, the public would benefit, but his family could starve to death

      BUT generally law-breaking sets a bad example (for the kids, others etc); its consequences include imitation of further law-breaking - rule-utilitarianism = it's difficult to compare all different rules and their consequences

    3. ALL FOUR CONCEPTS OF WHY YOU SHOULD FOLLOW THE LAW: 1. gratitude = your country & law was the source of great benefits for you, so you should at least obey the law but against, you could argue that you can be grateful to many people, but it doesn't mean you have to obey everything they say 2. promise-keeping: citizens promise to obey the law in exchange for protection & other benefits (kind of a "social contract" like in Rawls' theory) 3. fairness: different from promise-keeping, because it's extended to all citizens, as a moral ground to everyone, not just to those who choose to participate in the politics SO, you should obey the law, because it would be unfair not to; you owe your fellow citizens "if they all comply and you benefit, it is unfair if you benefit without complying" 4. public good = if people break the law, the welfare of society is diminished, thus we're all morally obliged to obey

    4. different forms of the utilitarian concept: - act-utilitarianism = an act is morally wrong if it'll have worse consequences than other acts possible on this occasion - "rule-utilitarianism": you should obey the law, if it's required by a rule, that leads to best consequences when observed objectively

    5. ‘rule-utilitarianism’: an action is right if required by a rule, where general observance of the rule would have best consequences.

      "rule-utilitarianism": you should obey the law, if it's required by a rule, that leads to best consequences when observed objectively

      BUT it's difficult to compare all different rules and their consequences

    6. utilitarian’

      most common justification of obeying the law: public good (there are of course examples where total obedience to the law does more harm than good - e.g. a man earns a small income from a service to a friend and doesn't pay tax on it - if he did, the public would benefit, but his family could starve to death)

      counter argument to ^ = act-utilitarianism: not taxing the income sets a bad example (for the kids, others etc); its consequences include imitation of further law-breaking

    7. prima facie

      2 things have to be provided, before you can call a law "based on the moral concept of fairness" (to dispute the previous two arguments): 1. laws have generally beneficial effects 2. most other people obey the law (so that if you don't you benefit unfairly)

    8. answers

      to the argument of fairness: 1. anarchist: denial of any benefit coming from the law 2. those who claim it's beneficial: fairness can't apply, where obeying the law does no one any good, is useless (eg. it's not unfair to speed over the limit on a deserted road)

    9. fairness’

      duty to obey law comes from fairness: different from promise-keeping, because it's extended to all citizens, as a moral ground to everyone, not just to those who choose to participate in the politics SO, you should obey the law, because it would be unfair not to; you owe your fellow citizens "if they all comply and you benefit, it is unfair if you benefit without complying"

    10. promise-keeping

      duty to obey the law comes from promise-keeping: citizens promise to obey the law in exchange for protection & other benefits (kind of a "social contract" like in Rawls' theory)

    11. if I have in my left hand a book of Jubbjubb etiquette - something totally unknown to you - and in my right hand a book of your country’s law, and I announce that I am going to open each at random, will you allow that there are moral reasons indicating obedience to whatever comes out of my right hand which plainly do not obtain in the case of the left-hand book? Or is your conscience equipoised between the two books - that is, until you hear the prescription read out, there is no way of knowing whether there will be moral reasons to comply? And remember: the question is not just one of probabilities. You might allow that, given your previous acquaintance with English law, there is more than a 50:50 chance that something required by it is something which there are moral grounds for performing. That is not enough. For one to be able to affirm that a prima facie moral duty to obey English law exists, one must be [...] whatever comes out of the English law book, there are reasons (stateable in advance) why it is morally right to comply - albeit that, once the prescription is known, other moral reasons may tell against.

      a good way of illustrating just how relative natural law (or the core principles) is. because at what point, to which extent and based on what, do we decide when to obey or disobey the law?