15 Matching Annotations
  1. Jan 2023
    1. Figure 3. The average drop in log probability (perturbation discrepancy) after rephrasing a passage is consistently higher for model-generated passages than for human-written passages. Each plot shows the distribution of the perturbation discrepancy d(x, pθ, q) for human-written news articles and machine-generated articles of equal word length from models GPT-2 (1.5B), GPT-Neo-2.7B (Black et al., 2021), GPT-J (6B; Wang & Komatsuzaki (2021)) and GPT-NeoX (20B; Black et al. (2022)). Human-written articles are a sample of 500 XSum articles; machine-generated text is generated by prompting each model with the first 30 tokens of each XSum article, sampling from the raw conditional distribution. Discrepancies are estimated with 100 T5-3B samples.

      Quite striking here is that the larger, more powerful models are more capable of generating unusual, "human-like" responses, which shows up as greater overlap between the log-likelihood (discrepancy) distributions.
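
      For reference, the perturbation discrepancy d(x, pθ, q) from the caption can be written out as follows (my transcription of the paper's definition, where q(·|x) is the perturbation distribution, e.g. T5 mask filling):

      ```latex
      % Perturbation discrepancy: original log probability minus the expected
      % log probability of perturbed rewrites drawn from q(. | x).
      d(x, p_\theta, q) = \log p_\theta(x) \;-\; \mathbb{E}_{\tilde{x} \sim q(\cdot \mid x)}\left[\log p_\theta(\tilde{x})\right]
      ```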

    2. if we apply small perturbations to a passage x ∼ pθ, producing x̃, the quantity log pθ(x) − log pθ(x̃) should be relatively large on average for machine-generated samples compared to human-written text.

      By applying small changes to a text sample x, we can compare the log prob of x against the log prob of the perturbed version, and the delta should be noticeably larger for machine-generated examples than for human-written ones.
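
      A minimal sketch of how this delta could be estimated, assuming GPT-2 as the scoring model and a pre-supplied list of perturbed rewrites (both are illustrative assumptions, not the paper's exact setup):

      ```python
      # Sketch: estimate log p(x) - mean log p(x_tilde) under a scoring model.
      import torch
      from transformers import AutoModelForCausalLM, AutoTokenizer

      tokenizer = AutoTokenizer.from_pretrained("gpt2")
      model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

      def total_log_prob(text: str) -> float:
          """Sum of token log-probabilities of `text` under the scoring model."""
          ids = tokenizer(text, return_tensors="pt").input_ids
          with torch.no_grad():
              logits = model(ids).logits
          # log-probability of each token given the preceding tokens
          log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
          token_lp = log_probs.gather(-1, ids[:, 1:].unsqueeze(-1)).squeeze(-1)
          return token_lp.sum().item()

      def perturbation_discrepancy(passage: str, perturbations: list[str]) -> float:
          """log p(x) minus the average log p(x_tilde) over perturbed rewrites."""
          original = total_log_prob(passage)
          perturbed = sum(total_log_prob(p) for p in perturbations) / len(perturbations)
          return original - perturbed
      ```

      A large positive value would point toward machine-generated text; values near zero (or negative) toward human-written text.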

    3. As in prior work, we study a 'white box' setting (Gehrmann et al., 2019) in which the detector may evaluate the log probability of a sample log pθ(x). The white box setting does not assume access to the model architecture or parameters. While most public APIs for LLMs (such as GPT-3) enable scoring text, some exceptions exist

      The authors assume white-box access to the log probability of a sample \(\log p_\theta(x)\) but do not require access to the model's actual architecture or weights.

    4. Empirically, we find predictive entropy to be positively correlated with passage fake-ness more often than not; therefore, this baseline uses high average entropy in the model's predictive distribution as a signal that a passage is machine-generated.

      This makes sense and aligns with GLTR: humans add more entropy to sentences by making unusual vocabulary choices that a model would not.
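
      A hedged sketch of what this entropy baseline could look like, again with GPT-2 standing in as an illustrative scoring model:

      ```python
      # Sketch of the entropy baseline: average predictive entropy of the
      # model's next-token distribution across a passage.
      import torch
      from transformers import AutoModelForCausalLM, AutoTokenizer

      tokenizer = AutoTokenizer.from_pretrained("gpt2")
      model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

      def mean_predictive_entropy(text: str) -> float:
          """Average entropy (nats) of p(token_t | tokens_<t) over the passage."""
          ids = tokenizer(text, return_tensors="pt").input_ids
          with torch.no_grad():
              logits = model(ids).logits              # [1, T, vocab]
          log_probs = torch.log_softmax(logits, dim=-1)
          entropy = -(log_probs.exp() * log_probs).sum(dim=-1)  # [1, T]
          return entropy.mean().item()
      ```

      Per the quoted observation, a higher value would be read as weak evidence that the passage is machine-generated.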

    5. We find that supervised detectors can provide similar detection performance to DetectGPT on in-distribution data like English news, but perform significantly worse than zero-shot methods in the case of English scientific writing and fail altogether for German writing.

      Supervised detection methods fail on out-of-domain examples, whereas DetectGPT seems to be robust to changes in domain.

    6. extending DetectGPT to use ensembles of models for scoring, rather than a single model, may improve detection in the black box setting

      DetectGPT could be extended to use ensembles of models, allowing it to work in black-box settings where the source model's log probs are unknown.
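
      This is only a suggestion in the paper, but the idea might look something like the sketch below: average the discrepancy over several surrogate scoring models instead of relying on the (unavailable) source model. `scoring_fns` is a hypothetical parameter of my own, not something from the paper.

      ```python
      # Hypothetical sketch of the ensemble idea for the black-box setting:
      # average the perturbation discrepancy across surrogate scoring models.
      from typing import Callable, Sequence

      def ensemble_discrepancy(passage: str,
                               perturbations: Sequence[str],
                               scoring_fns: Sequence[Callable[[str], float]]) -> float:
          """Each scoring_fn maps text -> total log probability under one surrogate."""
          discrepancies = []
          for score in scoring_fns:
              original = score(passage)
              perturbed = sum(score(p) for p in perturbations) / len(perturbations)
              discrepancies.append(original - perturbed)
          return sum(discrepancies) / len(discrepancies)
      ```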

    7. While in this work, we use off-the-shelf mask-filling models such as T5 and mT5 (for non-English languages), some domains may see reduced performance if existing mask-filling models do not well represent the space of meaningful rephrases, reducing the quality of the curvature estimate.

      The approach requires a mask-filling model that can meaningfully and accurately rephrase (perturb) outputs from the model under evaluation; if the perturbation model does not cover the relevant domain well, detection quality may suffer.
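
      A rough sketch of a single perturbation step with an off-the-shelf mask-filling model (t5-small here for brevity; the paper uses T5-3B and masks several spans per passage, so treat this as illustrative only):

      ```python
      # Sketch: mask a short word span and let a T5-style model propose a rephrase.
      import re
      import torch
      from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

      t5_tok = AutoTokenizer.from_pretrained("t5-small")
      t5 = AutoModelForSeq2SeqLM.from_pretrained("t5-small").eval()

      def perturb(passage: str, span_start: int, span_len: int = 2) -> str:
          """Replace a span of words with T5's fill for the <extra_id_0> sentinel."""
          words = passage.split()
          masked = words[:span_start] + ["<extra_id_0>"] + words[span_start + span_len:]
          inputs = t5_tok(" ".join(masked), return_tensors="pt")
          with torch.no_grad():
              out = t5.generate(**inputs, max_new_tokens=20, do_sample=True, top_p=0.95)
          decoded = t5_tok.decode(out[0], skip_special_tokens=False)
          # Extract the text T5 proposed for the first sentinel token.
          match = re.search(r"<extra_id_0>(.*?)(<extra_id_1>|</s>|$)", decoded, re.S)
          fill = match.group(1).strip() if match else ""
          return " ".join(words[:span_start] + [fill] + words[span_start + span_len:])
      ```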

    8. For models behind APIs that do provide probabilities (such as GPT-3), evaluating probabilities nonetheless costs money.

      This costs money for paid APIs, and it requires that log probs are exposed in the first place.

    9. We simulate human revision by replacing 5 word spans of the text with samples from T5-3B until r% of the text has been replaced, and report performance as r varies.

      I question the trustworthiness of this simulation - human edits are probably going to be more sporadic and random.
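
      To make the procedure concrete, here is the masking loop as I picture it, with a placeholder `fill_span` standing in for T5-3B mask filling (a hypothetical helper, not the authors' code):

      ```python
      # Sketch: replace random five-word spans until roughly r% of the words
      # have been touched, simulating human revision of machine-generated text.
      import random

      def fill_span(context: str) -> list[str]:
          """Placeholder for a T5-style mask-filling model; returns five new words."""
          return ["[edited]"] * 5

      def simulate_human_revision(passage: str, r: float, span_len: int = 5) -> str:
          words = passage.split()
          target = int(r * len(words))
          replaced: set[int] = set()
          while len(replaced) < target and len(replaced) < len(words):
              start = random.randrange(0, max(1, len(words) - span_len))
              words[start:start + span_len] = fill_span(" ".join(words))[:span_len]
              replaced.update(range(start, min(start + span_len, len(words))))
          return " ".join(words)
      ```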

    10. Figure 5. We simulate human edits to machine-generated text by replacing varying fractions of model samples with T5-3B generated text (masking out random five word spans until r% of text is masked to simulate human edits to machine-generated text). The four top-performing methods all generally degrade in performance with heavier revision, but DetectGPT is consistently most accurate. Experiment is conducted on the XSum dataset

      DetectGPT shows about 95% AUROC for texts that have been modified by roughly 10%, dropping to about 85% when up to 24% of the text is changed.

    11. DetectGPT's performance in particular is mostly unaffected by the change in language from English to German

      The method's performance is robust to changes in language (e.g. English to German).

    12. Because the GPT-3 API does not provide access to the complete conditional distribution for each token, we cannot compare to the rank, log rank, and entropy-based prior methods

      The GPT-3 API does not expose the conditional probabilities for each token, so the authors can't compare against some of the prior methods. It also suggests that DetectGPT itself needs only passage-level log probabilities, i.e. it can be used with more limited access to the probabilities than the rank- and entropy-based baselines.

    13. improving detection of fake news articles generated by 20B parameter GPT-NeoX

      The authors test their approach on GPT-NeoX. The open question is whether we can get hold of the log probs from ChatGPT to do the same.

    14. This approach, which we call DetectGPT, does not require training a separate classifier, collecting a dataset of real or generated passages, or explicitly watermarking generated text. It uses only log probabilities computed by the model of interest and random perturbations of the passage from another generic pre-trained language model (e.g., T5)

      The novelty of this approach is that it is cheap to set up, with no classifier training, dataset collection, or watermarking, as long as you can obtain the log probabilities computed by the model of interest.
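
      As I understand it, the whole method reduces to thresholding the perturbation discrepancy (with ε a tunable detection threshold; the symbol is my notation for it):

      ```latex
      % DetectGPT decision rule as I read it: flag x as model-generated when the
      % perturbation discrepancy exceeds a threshold epsilon.
      x \text{ is judged machine-generated} \iff d(x, p_\theta, q) > \epsilon
      ```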

    15. See ericmitchell.ai/detectgpt for code, data, and other project information.

      Code and data available at https://ericmitchell.ai/detectgpt