939 Matching Annotations
  1. Jan 2023
    1. The problem with this approach is that the emulation slows down your runtime. How much slower is it? One benchmark I ran was only 6% of the speed of the host machine!

      Speed is the core problem with emulation

    2. The other option is to run x86_64 Docker images on your ARM64 Mac machine, using emulation. Docker is packaged with software that will translate or emulate x86_64 machine code into ARM64 machine code on the fly; it’s slow, but the code will run.

      Another possible solution for M1 users (see snippets below)

    3. Third, you can pre-compile wheels, store them somewhere, and install those directly instead of downloading the packages from PyPI.

      Third possible solution for M1 users

    4. If you have a compiler installed in your Docker image and any required native libraries and development headers, you can compile a native package from the source code. Basically, you add a RUN apt-get upgrade && apt-get install -y gcc and iterate until the package compiles successfully.

      Second possible solution for M1 users

    5. First, it’s possible the relevant wheels are available in newer versions of the libraries.

      First possible solution for M1 users

    6. When you pip install murmurhash==1.0.6 on a M1/M2 Mac inside Docker, again it looks at the available files

      Other possible steps that pip will do when trying to install a Python package without a relevant CPU instruction set

    7. When you pip install filprofiler==2022.05.0 on a M1/M2 Mac inside Docker, pip will look at the available files for that version, and then

      3 steps that pip will do when trying to install a Python package without a relevant CPU instruction set

    8. In either case, pure Python will Just Work, because it’s interpreted at runtime: there’s no CPU-specific machine code, it’s just text that the Python interpreter knows how to run. The problems start when we start using compiled Python extensions. These are machine code, and therefore you need a version that is specific to your particular CPU instruction set.

      M1 Python issues
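To see which case applies on a given machine or inside a given container, it can help to ask the interpreter what it is running on. This is a minimal standard-library sketch; the helper name `describe_platform` is made up for illustration and is not from the quoted article.

```python
import platform
import sysconfig


def describe_platform() -> dict:
    """Collect the values that matter when picking compiled wheels."""
    return {
        # CPU architecture, e.g. "arm64", "aarch64" or "x86_64"
        "machine": platform.machine(),
        # Operating system name, e.g. "Darwin" or "Linux"
        "system": platform.system(),
        # Platform string used by the build tooling, e.g. "linux-aarch64"
        "platform_tag": sysconfig.get_platform(),
    }


if __name__ == "__main__":
    for key, value in describe_platform().items():
        print(f"{key}: {value}")
```

Running this on the host and again inside the Docker container shows whether the container is native ARM64 or an emulated x86_64 image.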

    1. Data Viz with Python and R: Learn to Make Plots in Python and R

      data viz with python and R

    1. Here's my opinion, having written many thousands of lines of mypy code.

      Negative opinion on mypy (see below this annotation)

  2. Dec 2022
    1. Overall, the code was significantly shorter compared to the tkinter version I did last year. That version had a few more features, but I'd say Textual felt much easier to reason about.

      Textual is much easier than tkinter.

    1. Or more directly

      Hack for pasting multiline Python scripts in a terminal:

      1. python
      2. exec('''<paste code>''')
      3. [ENTER]
    1. In Python, everything is an object – integers, strings, lists, functions, even classes themselves.
    1. Try this code at app startup:

      Code to improve Python GC settings to increase the performance by 20%

    2. The trigger is when you allocate 700 or more container objects (classes, dicts, tuples, lists, etc) more than have been cleaned up, a GC cycle runs.

      Trigger for GC runs in Python
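The startup code itself is not reproduced in this digest, so the following is only a sketch of the general idea: raise the generation thresholds so that collections run less often. The function name and the specific numbers (50,000 allocations, a factor of 5) are illustrative assumptions, not the values from the quoted article.

```python
import gc


def raise_gc_thresholds(gen0_allocations: int = 50_000, factor: int = 5) -> tuple:
    """Make garbage-collection cycles rarer; returns the previous thresholds."""
    previous = gc.get_threshold()  # the quoted text cites 700 for generation 0
    _, gen1, gen2 = previous
    # Assumed values: a much larger generation-0 threshold, older generations scaled up
    gc.set_threshold(gen0_allocations, gen1 * factor, gen2 * factor)
    return previous


if __name__ == "__main__":
    old = raise_gc_thresholds()
    print("before:", old, "after:", gc.get_threshold())
```

Any speedup depends on the workload, so it is worth measuring before and after rather than assuming a fixed gain.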

    1. To summarize the three options we’ve seen, as well as a streaming ijson-based solution:

      Comparison of 4 Python's JSON libraries

    1. Always start with functions. Grow to classes once you feel you can group different subsets of functions.

      Python rules for creating a function or a class

    2. First of all, in Python there are no such things as "files" and I noticed this is the main source of confusion for beginners. If you're inside a directory that contains any __init__.py, it's a directory composed of modules, not files.

      On "files" in Python

    1. the fact that the Poetry developers intentionally introduced failures to people’s CI/CD pipelines to motivate them to move away from Poetry’s legacy installer… Though we didn’t rely on the installer in our pipelines, this was the death knell for the continued use of Poetry.

      Video on this topic: https://youtu.be/Gr9o8MW_pb0

  3. Nov 2022
    1. There are plenty of articles about the emergence of PyScript for embedding Python code directly into HTML, but until now the creation of browser extensions in Python has been something of a closed door.

      One can use PyScript to write browser extensions in Python (or at least some simple ones?)

    1. notice that defaultdict not only returns the default value, but also assigns it to the key that wasn't there before:

      See example below about defaultdict
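The behaviour described in the quote can be checked with a few lines. This is a small illustrative sketch (the key name is arbitrary), contrasting `defaultdict` lookup with `dict.get`, which returns a default without storing it.

```python
from collections import defaultdict

counts = defaultdict(int)
present_before = "apple" in counts   # the key is absent before any lookup

value = counts["apple"]              # lookup calls int() and stores the result
present_after = "apple" in counts    # the key now exists in the mapping

# dict.get, by contrast, returns the default without assigning it
plain = {}
fallback = plain.get("apple", 0)
present_in_plain = "apple" in plain

print(present_before, value, present_after, present_in_plain)
```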

    2. we might need a dictionary subclass, and then we need to access a key that does not exist in that dictionary

      Example of applying __missing__ dunder method:

      ```python
      class DictSubclass(dict):
          def __missing__(self, key):
              print("Hello, world!")

      my_dict = DictSubclass()
      my_dict["this key isn't available"]
      # Hello, world!
      ```

    3. The table also includes links to the documentation of the dunder method under the emoji 🔗. When available, relevant Pydon'ts are linked under the emoji 🗒️.

      Table below lists Python dunder methods

    4. >>> 3 in my_list
       False
       >>> my_list.__contains__(3)
       False

      3 in my_list is the same as my_list.__contains__(3)

    5. “dunder” comes from “double underscore”

      dunder = double underscores (__)

    6. dunder methods are methods that allow instances of a class to interact with the built-in functions and operators

      Python's dunder methods
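As a small illustration of that definition, here is a toy class that opts into `len()`, the `in` operator and indexing. The `Playlist` class is hypothetical and only serves the example.

```python
class Playlist:
    """Toy container that supports len(), `in` and indexing via dunder methods."""

    def __init__(self, songs):
        self._songs = list(songs)

    def __len__(self):
        # Called by len(playlist)
        return len(self._songs)

    def __contains__(self, song):
        # Called by `song in playlist`
        return song in self._songs

    def __getitem__(self, index):
        # Called by playlist[index]
        return self._songs[index]


playlist = Playlist(["intro", "verse", "outro"])
print(len(playlist), "verse" in playlist, playlist[0])
```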

  4. Oct 2022
    1. I'm afraid you missed the joke ;-) While you believe spaces are required on both sides of an em dash, there is no consensus on this point. For example, most (but not all) American authorities say /no/ spaces should be used. That's the joke. In writing a line about "only one way to do it", I used a device (em dash) for which at least two ways to do it (with spaces, without spaces) are commonly used, neither of which is obvious -- and deliberately picked a third way just to rub it in. This will never change ;-)
    2. This text has a line which has an ortographical typo in it. Please look at this line of text from the Zen of Python: There should be one-- and preferably only one --obvious way to do it.

      first sighting: ortographical

    1. Beautiful is better than ugly.
       Explicit is better than implicit.
       Simple is better than complex.
       Complex is better than complicated.
       Flat is better than nested.
       Sparse is better than dense.
       Readability counts.
       Special cases aren't special enough to break the rules.
       Although practicality beats purity.
       Errors should never pass silently.
       Unless explicitly silenced.
       In the face of ambiguity, refuse the temptation to guess.
       There should be one– and preferably only one –obvious way to do it.
       Although that way may not be obvious at first unless you're Dutch.
       Now is better than never.
       Although never is often better than right now.
       If the implementation is hard to explain, it's a bad idea.
       If the implementation is easy to explain, it may be a good idea.
       Namespaces are one honking great idea – let's do more of those!
    1. Python is known for using more memory than more optimized languages and, in this case, it uses 7 times more than PostgresML.
    2. PostgresML outperforms traditional Python microservices by a factor of 8 in local tests and by a factor of 40 on AWS EC2.
    1. You may need pathname2url

      (Author:: [[neurino on Stack Overflow]])

          from urllib.request import pathname2url
          pathname2url('dir/foo, bar.mp3')

  5. Sep 2022
    1. Mamba installs these packages in only a third of the time that Conda does. Much of that is due to less CPU usage, but even network downloads seem to be a little faster; Mamba uses parallel downloads to speed them up.

      Mamba is a lot faster than Conda

    1. So which should you use, pip or Conda? For general Python computing, pip and PyPI are usually fine, and the surrounding tooling tends to be better. For data science or scientific computing, however, Conda’s ability to package third-party libraries, and the centralized infrastructure provided by Conda-Forge, means setup of complex packages will often be easier.

      From my experience, I would use Mambaforge or pyenv and Poetry.

    1. Errors detected during execution are called exceptions and are not unconditionally fatal: you will soon learn how to handle them in Python programs.

      exceptions

    2. There are (at least) two distinguishable kinds of errors: syntax errors and exceptions.

      kind of errors
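The two kinds of errors can be told apart in code. This is a minimal sketch with hypothetical helper names: a syntax error is raised when source text is compiled, before anything runs, while an exception such as `ZeroDivisionError` is raised during execution and can be handled.

```python
def is_valid_syntax(source: str) -> bool:
    """Return False if the source text cannot even be parsed."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


def safe_divide(a: float, b: float):
    """Handle an error that is only detected during execution."""
    try:
        return a / b
    except ZeroDivisionError as exc:
        print(f"cannot divide: {exc}")
        return None


print(is_valid_syntax("while True print('hi')"))  # missing colon
print(safe_divide(1, 0))
```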

    1. and

      as

      assert

      break

      class

      continue

      def

      del

      elif

      else

      except

      exec

      finally

      for

      from

      global

      if

      import

      in

      is

      lambda

      nonlocal

      not

      or

      pass

      raise

      return

      try

      while

      with

      yield

      True

      False

      None

    1. Hedy: A gradual programming language. Try it out

      I came across this programming language via a podcast episode by Pom. https://podimo.com/nl/shows/99aa420b-14d0-4ffc-8e79-a55ed8f793e4/episode/363c968e-7fd6-4649-9ab9-46b123a837d9?creatorId=8617f4d8-7b12-4cf4-9a90-71036b3f8edf&key=tR65wldt702F&source=ln&from=mobile&utmSource=$2a$10$6rBa4aroNNZ3pA6Qops54.xSeaCoSXkXBw2YWXfoNoEHrjz.bm42G&variant=enabled_v2 Techoptimisme #2 - A solution to the IT shortage. She deserves a statue; that is how enthusiastic Alexander and Ernst-Jan are about Felienne Hermans. She is a programmer and associate professor, and she developed Hedy, a programming language for children.

      And that is very good news, because if more children learn to program, we will get a generation of adults who are far more digitally literate than the policymakers, politicians and teachers we have now.

      Felienne explains with infectious enthusiasm how special it is that this language exists. With Hedy you learn to program step by step until, after a year, you can write Python. Felienne tells how she pulled this off and dreams out loud about a way to teach as many children as possible to program.

    1. # "func" called 3 times
       result = [func(x), func(x)**2, func(x)**3]
       # Reuse result of "func" without splitting the code into multiple lines
       result = [y := func(x), y**2, y**3]

      Smart example of using the walrus operator :=
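The saving can be made visible by counting calls. In this sketch `func` is a stand-in (the original does not define it) that records how often it runs; it requires Python 3.8 or later for the `:=` operator.

```python
def func(x):
    """Stand-in for an expensive function; counts how often it is called."""
    func.calls += 1
    return x + 1


x = 2

func.calls = 0
repeated = [func(x), func(x) ** 2, func(x) ** 3]
calls_repeated = func.calls   # func ran once per list element

func.calls = 0
reused = [y := func(x), y ** 2, y ** 3]
calls_reused = func.calls     # func ran once; y was reused

print(repeated, calls_repeated, reused, calls_reused)
```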

  6. Aug 2022
    1. Formalizing the Lazy Sequence - Lazy Evaluation in Python - Part 5

      Formalizing the Lazy Sequence - Lazy Evaluation in Python - Part 5

    1. Advanced Lazy Evaluation - Lazy Evaluation in Python - Part 4

      Advanced Lazy Evaluation - Lazy Evaluation in Python - Part 4

    1. Every beginner-level tutorial for scientists should state during the first five minutes that you cannot expect stability and that you should either use Python only for throw-away code or else
    1. What is not OK is what I perceive as the dominant attitude today: sell SciPy as a great easy-to-use tool for all scientists, and then, when people get bitten by breaking changes, tell them that it’s their fault for not having a solid maintenance plan for their code.
    1. Count Occurrences of Each Character in a String in Python

      An overview of different ways to count the number of occurrences of elements in a string.

    1. Getting started with lsp-mode for Python

      Explains how to install LSP mode for Python

    1. Early notes on using the new python-lsp-server (pylsp) in GNU Emacs

      Explains the transition from Elpy to LSP-mode

  7. Jul 2022
    1. when you use python -m pip with python being the specific interpreter you want to use, all of the above ambiguity is gone. If I say python3.8 -m pip then I know pip will be using and installing for my Python 3.8 interpreter (same goes for if I had said python3.7).

      It's better to use python -m pip over pip / pip3 to be sure for which Python version we're installing the dependencies.

      However, it's not necessary when using environments.

    2. And if you're on Windows there is an added benefit to using python -m pip as it lets pip update itself. Basically because pip.exe is considered running when you do pip install --upgrade pip, Windows won't let you overwrite pip.exe. But if you do python -m pip install --upgrade pip you avoid that issue as it's python.exe that's running, not pip.exe.

      If you would like to update pip on Windows, use python -m pip install --upgrade pip

    1. It’s time to say goodbye to distutils package and switch to setuptools.

      Use setuptools over distutils

    2. as soon as you switch to Python 3.11, you should get into habit of using import tomllib instead of import tomli

      tomllib

    3. It's fine to use print if you're debugging an issue locally, but for any production-ready program that will run without user intervention, proper logging is a must.

      In production, use logging instead of print
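A minimal sketch of what replacing `print` with the standard `logging` module can look like. The logger name, format string and `process` function are illustrative assumptions, not taken from the quoted article.

```python
import logging

# Hypothetical logger name; in practice logging.getLogger(__name__) is common
logger = logging.getLogger("myapp.worker")


def configure_logging(level: int = logging.INFO) -> None:
    """Set up root logging once, at program startup."""
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )


def process(item: str) -> str:
    # Unlike print, this record carries a level, a timestamp and the logger name
    logger.info("processing %s", item)
    return item.upper()


if __name__ == "__main__":
    configure_logging()
    process("example")
```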

    4. Finally, if you don’t use either namedtuple or dataclasses you might want to consider going straight to Pydantic.
    5. You might be wondering why would you need to replace namedtuple? So, these are some reasons why you should consider switching to dataclasses

      There are a number of reasons why to prefer dataclasses over namedtuple
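The quoted article's list of reasons is not reproduced here, so this sketch only illustrates two commonly cited differences (an assumption about which reasons matter): a namedtuple still behaves like a tuple, while a dataclass is a regular class with typed fields, defaults and optional mutability.

```python
from collections import namedtuple
from dataclasses import dataclass

PointTuple = namedtuple("PointTuple", ["x", "y"])


@dataclass
class Point:
    x: int
    y: int = 0  # type hints and defaults are part of the declaration


tuple_point = PointTuple(1, 2)
data_point = Point(1, 2)

# A namedtuple compares equal to any plain tuple holding the same values
tuple_equals_plain = tuple_point == (1, 2)
# A dataclass instance only compares equal to instances of its own class
data_equals_plain = data_point == (1, 2)

# Dataclass instances are mutable unless declared with frozen=True
data_point.x = 10
print(tuple_equals_plain, data_equals_plain, data_point)
```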

    6. Using zoneinfo, however, has one caveat: it assumes that there's time zone data available on the system, which is the case on UNIX systems. If your system doesn't have time zone data, you should use the tzdata package, a first-party library maintained by the CPython core developers that contains the IANA time zone database.

      One caveat of zoneinfo

    7. Until Python 3.9, there wasn’t a built-in library for timezone manipulation, so everyone was using pytz, but now we have zoneinfo in the standard library, so it's time to switch!

      Prefer zoneinfo over pytz from Python 3.9

    8. As per docs, random module should not be used for security purposes. You should use either secrets or os.urandom, but the secrets module is definitely preferable, considering that it's newer and includes some utility/convenience methods for hexadecimal tokens as well as URL safe tokens.

      Prefer secrets module over os.urandom
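A short sketch of the convenience helpers mentioned in the quote, using only functions from the standard `secrets` module. The token sizes and the 6-digit PIN are arbitrary choices for illustration.

```python
import secrets

# 16 random bytes rendered as 32 hexadecimal characters
hex_token = secrets.token_hex(16)

# 16 random bytes rendered as URL-safe Base64 text
url_token = secrets.token_urlsafe(16)

# Cryptographically strong choice from a sequence, here used for a 6-digit PIN
pin = "".join(secrets.choice("0123456789") for _ in range(6))

print(hex_token, url_token, pin)
```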

    9. pathlib, however, has many advantages over the old os.path: while the os module represents paths as raw strings, pathlib uses an object-oriented style, which makes it more readable and natural to write
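A side-by-side sketch of the two styles. The path components are made up, and `PurePosixPath` is used instead of `Path` only so the example behaves the same on every operating system.

```python
import os.path
from pathlib import PurePosixPath

# String-based style with os.path
joined_str = os.path.join("project", "data", "report.csv")
stem_str = os.path.splitext(os.path.basename(joined_str))[0]

# Object-oriented style with pathlib
path = PurePosixPath("project") / "data" / "report.csv"
stem = path.stem      # file name without its extension
suffix = path.suffix  # the extension, including the dot
parent = path.parent  # the containing directory as another path object

print(stem_str, stem, suffix, parent)
```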
    1. If you need to store duplicates, go for List or Tuple. For List vs. Tuple, if you do not intend to mutate, go for Tuple. If you do not need to store duplicates, always go for Set or Dictionary. Hash maps are significantly faster when it comes to determining if an object is present in the Set (e.g. x in set_or_dict).

      Python list vs tuple vs set
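The membership claim can be checked with a quick timing sketch. The collection size and repeat count are arbitrary, and absolute timings will vary by machine; only the relative difference is of interest.

```python
import timeit

items = list(range(100_000))
as_list = items
as_set = set(items)
needle = 99_999  # worst case for the list: the last element

# `in` on a list scans element by element; on a set it is a hash lookup
list_seconds = timeit.timeit(lambda: needle in as_list, number=200)
set_seconds = timeit.timeit(lambda: needle in as_set, number=200)

print(f"list: {list_seconds:.4f}s  set: {set_seconds:.6f}s")
```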

    1. The name originated from the British comedy group Monty Python, creators of the show Monty Python's Flying Circus, although many people associate it with the reptile of the same name.

      020722 224423 Sat. R15. BH o R.

    1. ```python
       import re

       # ENSEMBL_PREFIXES and ARRAYEXPRESS_CODES are not shown in this excerpt;
       # they are assumed to be defined elsewhere in the source module.

       doi_regexp = re.compile(
           r"(doi:\s|(?:https?://)?(?:dx.)?doi.org/)?(10.\d+(.\d+)/.+)$", flags=re.I
       )
       """See http://en.wikipedia.org/wiki/Digital_object_identifier."""

       handle_regexp = re.compile(
           r"(hdl:\s|(?:https?://)?hdl.handle.net/)?" r"([^/.]+(.[^/.]+)/.)$", flags=re.I
       )
       """See http://handle.net/rfc/rfc3651.html.

       <Handle> = <NamingAuthority> "/" <LocalName>
       <NamingAuthority> = (<NamingAuthority> ".") <NAsegment>
       <NAsegment> = Any UTF8 char except "/" and "."
       <LocalName> = Any UTF8 char
       """

       arxiv_post_2007_regexp = re.compile(r"(arxiv:)?(\d{4}).(\d{4,5})(v\d+)?$", flags=re.I)
       """See http://arxiv.org/help/arxiv_identifier and http://arxiv.org/help/arxiv_identifier_for_services."""

       arxiv_pre_2007_regexp = re.compile(
           r"(arxiv:)?([a-z-]+)(.[a-z]{2})?(/\d{4})(\d+)(v\d+)?$", flags=re.I
       )
       """See http://arxiv.org/help/arxiv_identifier and http://arxiv.org/help/arxiv_identifier_for_services."""

       arxiv_post_2007_with_class_regexp = re.compile(
           r"(arxiv:)?(?:[a-z-]+)(?:.[a-z]{2})?/(\d{4}).(\d{4,5})(v\d+)?$", flags=re.I
       )
       """Matches new style arXiv ID, with an old-style class specification; technically malformed, however appears in real data."""

       hal_regexp = re.compile(r"(hal:|HAL:)?([a-z]{3}[a-z]*-|(sic|mem|ijn)_)\d{8}(v\d+)?$")
       """Matches HAL identifiers (sic mem and ijn are old identifiers form)."""

       ads_regexp = re.compile(r"(ads:|ADS:)?(\d{4}[A-Za-z]\S{13}[A-Za-z.:])$")
       """See http://adsabs.harvard.edu/abs_doc/help_pages/data.html"""

       pmcid_regexp = re.compile(r"PMC\d+$", flags=re.I)
       """PubMed Central ID regular expression."""

       pmid_regexp = re.compile(
           r"(pmid:|https?://pubmed.ncbi.nlm.nih.gov/)?(\d+)/?$", flags=re.I
       )
       """PubMed ID regular expression."""

       ark_suffix_regexp = re.compile(r"ark:/[0-9bcdfghjkmnpqrstvwxz]+/.+$")
       """See http://en.wikipedia.org/wiki/Archival_Resource_Key and https://confluence.ucop.edu/display/Curation/ARK."""

       lsid_regexp = re.compile(r"urn:lsid:[^:]+(:[^:]+){2,3}$", flags=re.I)
       """See http://en.wikipedia.org/wiki/LSID."""

       orcid_urls = ["http://orcid.org/", "https://orcid.org/"]

       gnd_regexp = re.compile(
           r"(gnd:|GND:)?("
           r"(1|10)\d{7}[0-9X]|"
           r"[47]\d{6}-\d|"
           r"[1-9]\d{0,7}-[0-9X]|"
           r"3\d{7}[0-9X]"
           r")"
       )
       """See https://www.wikidata.org/wiki/Property:P227."""

       gnd_resolver_url = "http://d-nb.info/gnd/"

       sra_regexp = re.compile(r"[SED]R[APRSXZ]\d+$")
       """Sequence Read Archive regular expression.

       See https://www.ncbi.nlm.nih.gov/books/NBK56913/#search.what_do_the_different_sra_accessi
       """

       bioproject_regexp = re.compile(r"PRJ(NA|EA|EB|DB)\d+$")
       """BioProject regular expression.

       See https://www.ddbj.nig.ac.jp/bioproject/faq-e.html#project-accession
           https://www.ebi.ac.uk/ena/submit/project-format
           https://www.ncbi.nlm.nih.gov/bioproject/docs/faq/#under-what-circumstances-is-it-n
       """

       biosample_regexp = re.compile(r"SAM(N|EA|D)\d+$")
       """BioSample regular expression.

       See https://www.ddbj.nig.ac.jp/biosample/faq-e.html
           https://ena-docs.readthedocs.io/en/latest/submit/samples/programmatic.html#accession-numbers-in-the-receipt-xml
           https://www.ncbi.nlm.nih.gov/biosample/docs/submission/faq/
       """

       ensembl_regexp = re.compile(
           r"({prefixes})(E|FM|G|GT|P|R|T)\d{{11}}$".format(
               prefixes="|".join(ENSEMBL_PREFIXES)
           )
       )
       """Ensembl regular expression.

       See https://asia.ensembl.org/info/genome/stable_ids/prefixes.html
       """

       uniprot_regexp = re.compile(
           r"([A-NR-Z]0-9{1,2})|" r"([OPQ][0-9][A-Z0-9]{3}[0-9])(.\d+)?$"
       )
       """UniProt regular expression.

       See https://www.uniprot.org/help/accession_numbers
       """

       refseq_regexp = re.compile(
           r"((AC|NC|NG|NT|NW|NM|NR|XM|XR|AP|NP|YP|XP|WP)|" r"NZ[A-Z]{4})\d+(.\d+)?$"
       )
       """RefSeq regular expression.

       See https://academic.oup.com/nar/article/44/D1/D733/2502674 (Table 1)
       """

       genome_regexp = re.compile(r"GC[AF]_\d+.\d+$")
       """GenBank or RefSeq genome assembly accession.

       See https://www.ebi.ac.uk/ena/browse/genome-assembly-database
       """

       geo_regexp = re.compile(r"G(PL|SM|SE|DS)\d+$")
       """Gene Expression Omnibus (GEO) accession.

       See https://www.ncbi.nlm.nih.gov/geo/info/overview.html#org
       """

       arrayexpress_array_regexp = re.compile(
           r"A-({codes})-\d+$".format(codes="|".join(ARRAYEXPRESS_CODES))
       )
       """ArrayExpress array accession.

       See https://www.ebi.ac.uk/arrayexpress/help/accession_codes.html
       """

       arrayexpress_experiment_regexp = re.compile(
           r"E-({codes})-\d+$".format(codes="|".join(ARRAYEXPRESS_CODES))
       )
       """ArrayExpress array accession.

       See https://www.ebi.ac.uk/arrayexpress/help/accession_codes.html
       """

       ascl_regexp = re.compile(r"^ascl:[0-9]{4}.[0-9]{3,4}$", flags=re.I)
       """ASCL regular expression."""

       swh_regexp = re.compile(
           r"swh:1:(cnt|dir|rel|rev|snp):[0-9a-f]{40}"
           r"(;(origin|visit|anchor|path|lines)=\S+)*$"
       )
       """Matches Software Heritage identifiers."""

       ror_regexp = re.compile(r"(?:https?://)?(?:ror.org/)?(0\w{6}\d{2})$", flags=re.I)
       """See https://ror.org/facts/#core-components."""
       ```

  8. Jun 2022
  9. fastapi.tiangolo.com
    1. @app.get("/items/{item_id}")
       def read_item(item_id: int, q: Union[str, None] = None):
           return {"item_id": item_id, "q": q}

      With the following URL, for example http://127.0.0.1:8000/items/1?q=hola

      it returns:

      { "item_id": 1, "q": "hola" }

    1. Python 3.11 is up to 10-60% faster than Python 3.10. On average, we measured a 1.25x speedup on the standard benchmark suite. See Faster CPython for details.

      On the speed of Python 3.11

  10. May 2022
    1. __init__.py is required to import the directory as a package, and should be empty.

      to import the directory as a package

    1. Pyenv works by adding a special directory called shims in front of your PATH environment variable

      How pyenv works

    2. If you are on Linux, you can simply download it from GitHub but the most convenient way is to use the pyenv-installer that is a simple script that will install it automatically on your distro, whatever it is, in the easiest possible way.

      Installing pyenv on Linux

    1. Without accounting for what we install or add inside, the base python:3.8.6-buster weighs 882MB vs 113MB for the slim version. Of course it's at the expense of many tools such as build toolchains but you probably don't need them in your production image. Your ops teams should be happier with these lighter images: less attack surface, less code that can break, less transfer time, less disk space used, ... And our Dockerfile is still readable so it should be easy to maintain.

      See sample Dockerfile above this annotation (below there is a version tweaked even further)

  11. Apr 2022
    1. In the previous version, using the standard library, once the data is loaded we no longer need to keep the file open. With this API the file has to stay open because the JSON parser is reading from the file on demand, as we iterate over the records.

      For ijson.items(), the peak tracked memory usage was 3.6 MiB for a large JSON, instead of 124.7 MiB as for the standard json.load()

    2. One common solution is streaming parsing, aka lazy parsing, iterative parsing, or chunked processing.

      Solution for processing large JSON files in Python

    3. Then, if the string can be represented as ASCII, only one byte of memory is used per character. If the string uses more extended characters, it might end up using as many as 4 bytes per character. We can see how much memory an object needs using sys.getsizeof()

      "a" takes less bytes than "❄", which takes less bytes than "💵"

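The ordering noted above can be checked directly with `sys.getsizeof()`. This is a small sketch; the exact byte counts depend on the interpreter version (the comparison assumes CPython's flexible string representation), so only the relative order is shown.

```python
import sys

# One ASCII character, one character outside Latin-1, one outside the
# Basic Multilingual Plane: each needs a wider per-character storage
sizes = {text: sys.getsizeof(text) for text in ("a", "❄", "💵")}

for text, size in sizes.items():
    print(f"{text!r}: {size} bytes")
```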
    1. wik2dict is a tool written in Python that converts MediaWiki SQL dumps into the DICT format. The script is available under the GNU General Public License. It is also capable of downloading Wikipedia, Wiktionary, Wikiquote, Wikinews and Wikibooks SQL dumps.
    1. Using named arguments is nice for languages that support it, but this is not always a possibility. Even in Python, where time.sleep is defined with a single argument named secs, we can’t call sleep(secs=300) due to implementation reasons. In that case, we can give the value a name instead.

       Instead of this:

           time.sleep(300)

       Do this:

           sleep_seconds = 300
           time.sleep(sleep_seconds)

       Now the code is unambiguous, and readable without having to consult the documentation.

      Putting units in variable names

  12. Mar 2022
    1. for debugging purposes, a good combination is --lf --trace which would start a debug session with pdb at the beginning of the last test that failed:

      pytest --lf --trace

    2. pytest -l

      Show values of local variables in the output with -l

    3. If you start pytest with --pdb, it will start a pdb debugging session right after an exception is raised in your test. Most of the time this is not particularly useful as you might want to inspect each line of code before the raised exception.

      The --pdb option for pytest

    4. pytest --lf

      Run the last failed test only with --lf

      Run all tests, but run the last failed ones first with --ff

    5. pytest -x

      Exiting on the 1st error with -x

    6. pytest --collect-only

      Collecting Pytests (not running them)

    7. pytest test_validator.py::test_regular_email_validates

      Example of running just one test (test_regular_email_validates) from test_validator.py

    8. Apart from shared fixtures you could place external hooks and plugins or modifiers for the PATH used by pytest to discover tests and implementation code.

      Additional things to store in conftest.py

    9. pytest can read its project-specific configuration from one of these files: pytest.ini tox.ini setup.cfg

      3 options for configuring pytest

    10. To have the fixture actually be used by one of your tests, you simply add the fixture’s name as an argument

      Example:

      ```python
      import pytest

      @pytest.fixture()
      def database_environment():
          setup_database()
          yield
          teardown_database()

      def test_world(database_environment):
          assert 1 == 1
      ```

    1. But the problem with Poetry is arguably down to the way Docker’s build works: Dockerfiles are essentially glorified shell scripts, and the build system semantic units are files and complete command runs. There is no way in a normal Docker build to access the actually relevant semantic information: in a better build system, you’d only re-install the changed dependencies, not reinstall all dependencies anytime the list changed. Hopefully someday a better build system will eventually replace the Docker default. Until then, it’s square pegs into round holes.

      Problem with Poetry/Docker

    2. Third, you can use poetry-dynamic-versioning, a plug-in for Poetry that uses Git tags instead of pyproject.toml to set your application’s version. That way you won’t have to edit pyproject.toml to update the version. This seems appealing until you realize you now need to copy .git into your Docker build, which has its own downsides, like larger images unless you’re using multi-stage builds.

      Approach of using poetry-dynamic-versioning plugin

    3. But if you’re doing some sort of continuous deployment process where you’re continuously updating the version field, your Docker builds are going to be slow.

      Be careful when updating the version field of pyproject.toml around Docker

    1. VCR.py works primarily via the @vcr decorator. You can import this decorator by writing: import vcr.

      How VCR.py works

    2. The VCR.py library records the responses from HTTP requests made within your unit tests. The first time you run your tests using VCR.py is like any previous run. But after VCR.py has had the chance to run once and record, all subsequent tests are:
       • Fast! No more waiting for slow HTTP requests and responses in your tests.
       • Deterministic. Every test is repeatable since they run off of previously recorded responses.
       • Offline-capable! Every test can now run offline.

      VCR.py library to speed up Python HTTP tests

  13. Feb 2022
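    1. Installing multiple Python versions

      When installing Python 2.7 and Python 3 on Windows, how do you set the system variables (under environment variables) so that each version can be run separately?

      1. After installing Python 2.7: 1.1 First rename python.exe to python2.exe. 1.2 Add the path where python2.exe is installed to the system variables.

      2. After installing Python 3.x: 2.1 Rename python.exe in the installation path to python3.exe. 2.2 Add the path containing python3.exe to the system variables.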

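    1. You can only process up to 1 TB of data per month for free. If you need more, you must pay at least $49 per month. 1 TB/month may be more than enough for testing the tool and for personal projects, but if you need it for actual company use, you will definitely have to pay.

      It costs money. That makes me hesitate a bit. https://www.terality.com/

    1. PyCaret is an open-source, low-code machine learning library in Python that aims to reduce the cycle time from data processing to model deployment.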
  14. Jan 2022
    1. An extension to python markdown that takes metadata embedded as YAML in a page of markdown and renders it as JSON-LD in the HTML created by MkDocs.
      • YAML input

        "@context": "http://schema.org"
        "@id": "#lesson1"
        "@type":
          - CreativeWork
        learningResourceType: LessonPlan
        hasPart: {
        "@id": "#activity1"
        }
        author:
          "@type": Person
          name: Phil Barker
        
      • Default JSON-LD output

        <script type="application/ld+json">
        { "@context":  "http://schema.org",
        "@id": "#lesson1",
        "@type":["CreativeWork"],
        "learningResourceType": "LessonPlan",
        "name": "Practice Counting Strategies",
        "hasPart": {
          "@id": "#activity1-1"
        },
        "author": {
          "@type": "Person",
          "name": "Phil Barker"
        }
        }
        </script>
        
    1. The metadata that we use for OCX is a profile of schema.org / LRMI,  OERSchema and few bits that we have added because we couldn’t find them elsewhere. Here’s what (mostly) schema.org metadata looks like in YAML:
      "@context":
          - "http://schema.org"
          - "oer": "http://oerschema.org/"
          - "ocx": "https://github.com/K12OCX/k12ocx-specs/"
      "@id": "#Lesson1"
      "@type":
          - oer:Lesson
          - CreativeWork
      learningResourceType: LessonPlan
      hasPart:
        "@id": "#activity1-1"
      author:
          "@type": Person
          name: Phil Barker
      
    2. I’ve been experimenting with ways of putting JSON-LD schema.org metadata into HTML created by MkDocs. The result is a python-markdown plugin that will (hopefully) find blocks of YAML in markdown and insert them into the HTML that is generated.
    1. Python | sep parameter in print(). Difficulty Level: Easy. Last Updated: 21 Jan, 2021. The separator between the arguments to the print() function in Python is a space by default (softspace feature), which can be changed to any string of our choice. The ‘sep’ parameter is used to achieve this; it is available only in Python 3.x or later. It is also used for formatting the output strings.

      Good idea for a pass string: a simple way to separate text with arbitrary separators
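A short sketch of the `sep` parameter in action. Output is written to an in-memory buffer only so that the result can be inspected; the sample values are arbitrary. Note that `sep` must be a string (or `None` for the default single space).

```python
import io

buffer = io.StringIO()

print("2021", "01", "21", sep="-", file=buffer)  # custom separator
print("a", "b", "c", file=buffer)                # default separator: one space
print("a", "b", "c", sep="", file=buffer)        # empty separator joins directly

output = buffer.getvalue()
print(output, end="")
```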

    1. A best practice among Python developers is to avoid installing packages into a global interpreter environment. You instead use a project-specific virtual environment that contains a copy of a global interpreter. Once you activate that environment, any packages you then install are isolated from other environments. Such isolation reduces many complications that can arise from conflicting package versions. To create a virtual environment and install the required packages, enter the following commands as appropriate for your operating system:
    1. Instead of “I have a type, it’s called MyType, it has a constructor, in the constructor I assign the property ‘A’ to the parameter ‘A’ (and so on)”, you say “I have a type, it’s called MyType, it has an attribute called a”

      How class declaration in Plain Old Python compares to attr

    2. attrs lets you declare the fields on your class, along with lots of potentially interesting metadata about them, and then get that metadata back out.

      The essence of what attr does

    3. >>> Point3D(1, 2, 3) == Point3D(1, 2, 3)

      attr library includes value comparison and does not require an explicit implementation:

          def __eq__(self, other):
              if not isinstance(other, self.__class__):
                  return NotImplemented
              return (self.x, self.y, self.z) == (other.x, other.y, other.z)
          def __lt__(self, other):
              if not isinstance(other, self.__class__):
                  return NotImplemented
              return (self.x, self.y, self.z) < (other.x, other.y, other.z)
      
    4. >>> Point3D(1, 2, 3)

      attr library includes string representation and does not require an explicit implementation:

      def __repr__(self):
          return (self.__class__.__name__ +
              ("(x={}, y={}, z={})".format(self.x, self.y, self.z)))
      
    5. Look, no inheritance! By using a class decorator, Point3D remains a Plain Old Python Class (albeit with some helpful double-underscore methods tacked on, as we’ll see momentarily).

      attr library removes a lot of boilerplate code when defining Python classes, and includes such features as string representation or value comparison.

      Example of a Plain Old Python Class:

      class Point3D(object):
          def __init__(self, x, y, z):
              self.x = x
              self.y = y
              self.z = z
      

      Example of a Python class defined with attr:

      import attr
      @attr.s
      class Point3D(object):
          x = attr.ib()
          y = attr.ib()
          z = attr.ib()
      
  15. Dec 2021
    1. import warc
      
      from StringIO import StringIO
      from httplib import HTTPResponse
      
      class FakeSocket():
          def __init__(self, response_str):
              self._file = StringIO(response_str)
          def makefile(self, *args, **kwargs):
              return self._file
      
      for record in warc.open("eada.warc.gz"):
          if record.type == "response":
              resp = HTTPResponse(FakeSocket(record.payload.read()))
              resp.begin()
              if resp.getheader("content-type") == "text/html":
                  print record['WARC-Target-URI']
      

      I sorted the output and came up with a nice list of URLs for the website. Here is a brief snippet:

      http://mith.umd.edu/eada/gateway/winslow.php
      http://mith.umd.edu/eada/gateway/winthrop.php
      http://mith.umd.edu/eada/gateway/witchcraft.php
      http://mith.umd.edu/eada/gateway/wood.php
      http://mith.umd.edu/eada/gateway/woolman.php
      http://mith.umd.edu/eada/gateway/yeardley.php
      http://mith.umd.edu/eada/guesteditors.php
      http://mith.umd.edu/eada/html/display.php?docs=acrelius_founding.xml&action=show
      http://mith.umd.edu/eada/html/display.php?docs=alsop_character.xml&action=show
      http://mith.umd.edu/eada/html/display.php?docs=arabic.xml&action=show
      http://mith.umd.edu/eada/html/display.php?docs=ashbridge_account.xml&action=show
      http://mith.umd.edu/eada/html/display.php?docs=banneker_letter.xml&action=show
      http://mith.umd.edu/eada/html/display.php?docs=barlow_anarchiad.xml&action=show
      http://mith.umd.edu/eada/html/display.php?docs=barlow_conspiracy.xml&action=show
      http://mith.umd.edu/eada/html/display.php?docs=barlow_vision.xml&action=show
      http://mith.umd.edu/eada/html/display.php?docs=barlowe_voyage.xml&action=show
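
The quoted script targets Python 2 (`StringIO`, `httplib`, the `print` statement). As a rough sketch of the same fake-socket trick on Python 3, assuming only the standard library: `http.client.HTTPResponse` calls `makefile()` on whatever it is given, so a small wrapper around `BytesIO` is enough. The `raw` payload and the helper name `parse_content_type` are made up for illustration and stand in for `record.payload.read()`.

```python
from http.client import HTTPResponse
from io import BytesIO


class FakeSocket:
    """Minimal stand-in for a socket: HTTPResponse only calls makefile()."""

    def __init__(self, response_bytes):
        self._file = BytesIO(response_bytes)

    def makefile(self, *args, **kwargs):
        return self._file


def parse_content_type(raw_response):
    """Parse raw HTTP response bytes and return the Content-Type header."""
    resp = HTTPResponse(FakeSocket(raw_response))
    resp.begin()  # reads the status line and the headers
    return resp.getheader("content-type")


# Hypothetical payload standing in for record.payload.read()
raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<html></html>"
)
print(parse_content_type(raw))
```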
      
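    1. Pornographic image detection based on nudity level: nudepy. This library can basically be seen as a more powerful version of the method above. It implements a skin classifier in C and decides whether an image is pornographic based on a more elaborate measure of nudity. The program entry point is in pic_classify_nude.py, which is mainly a wrapper around the nude library >>>from pic_classify_nude import test >>> >>>test('1.png') # is the image pornographic? T/F True

      A decent entry-level solution.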

    1. 100 000+ datapoints). This library solves this by downsampling the signal for the currently selected time window and then plotting the downsampled points.

      A plotting library optimized for very large time series.
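
To make the downsampling idea concrete, here is a minimal sketch in plain Python. It is not the library's actual algorithm; it assumes a simple min/max-per-bucket scheme, and the function name `downsample_minmax` is hypothetical. The point is only that a window of 100 000 values can be reduced to a number of points a plot can draw cheaply while keeping visible peaks.

```python
def downsample_minmax(values, n_buckets):
    """Keep the min and max of each bucket so peaks survive downsampling."""
    if n_buckets <= 0:
        raise ValueError("n_buckets must be positive")
    if len(values) <= 2 * n_buckets:
        return list(values)  # already small enough to plot directly

    size = len(values) / n_buckets
    out = []
    for b in range(n_buckets):
        chunk = values[int(b * size):int((b + 1) * size)]
        lo_i = min(range(len(chunk)), key=chunk.__getitem__)
        hi_i = max(range(len(chunk)), key=chunk.__getitem__)
        # Emit the two extremes in their original order.
        for i in sorted({lo_i, hi_i}):
            out.append(chunk[i])
    return out


signal = list(range(100_000))           # stand-in for the selected time window
reduced = downsample_minmax(signal, 500)
print(len(signal), "->", len(reduced))
```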

  16. Nov 2021
    1. I’d probably choose the official Docker Python image (python:3.9-slim-bullseye) just to ensure the latest bugfixes are always available.

      python:3.9-slim-bullseye may be the sweet spot for a Python Docker image

    2. So which should you use? If you’re a RedHat shop, you’ll want to use their image. If you want the absolute latest bugfix version of Python, or a wide variety of versions, the official Docker Python image is your best bet. If you care about performance, Debian 11 or Ubuntu 20.04 will give you one of the fastest builds of Python; Ubuntu does better on point releases, but will have slightly larger images (see above). The difference is at most 10% though, and many applications are not bottlenecked on Python performance.

      Choosing the best Python base Docker image depends on different factors.

    3. There are three major operating systems that roughly meet the above criteria: Debian “Bullseye” 11, Ubuntu 20.04 LTS, and RedHat Enterprise Linux 8.

      3 candidates for the best Python base Docker image

    1. If we call this using Bash, it never gets further than the exec line, and when called using Python it will print lol as that's the only effective Python statement in that file.
      #!/bin/bash
      "exec" "python" "myscript.py" "$@"
      print("lol")
      
    2. For Python the variable assignment is just a var with a weird string, for Bash it gets executed and we store the result.

      __PYTHON="$(command -v python3 || command -v python)"

    1. x() is the same as doing x.__call__()
    2. How do you even begin to check if you can try and “call” a function, class, and whatnot? The answer is actually quite simple: You just see if the object implements the __call__ special method.

      Use of __call__
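
A small sketch of the point above: any object whose class defines `__call__` can be used like a function, and the built-in `callable()` performs the check. The `Greeter` class is made up for illustration.

```python
class Greeter:
    """Instances behave like functions because the class defines __call__."""

    def __init__(self, greeting):
        self.greeting = greeting

    def __call__(self, name):
        return f"{self.greeting}, {name}!"


hello = Greeter("Hello")
print(hello("world"))            # normal call syntax
print(hello.__call__("world"))   # what the call syntax resolves to
print(callable(hello), callable(len), callable(42))
```

Strictly speaking, Python looks `__call__` up on the type rather than on the instance, which is why assigning a `__call__` attribute to a single instance does not make it callable.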

    3. Python is referred to as a “duck-typed” language. What it means is that instead of caring about the exact class an object comes from, Python code generally tends to check instead if the object can satisfy certain behaviours that we are looking for.
    4. everything is stored inside dictionaries. And the vars method exposes the variables stored inside objects and classes.

      Python stores objects, their variables, methods and such inside dictionaries, which can be checked using vars()
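
As a quick illustration of the note above, `vars()` returns the `__dict__` of an instance or a class. The `Point` class here is hypothetical.

```python
class Point:
    dimensions = 2  # class attribute, stored in the class's own dictionary

    def __init__(self, x, y):
        self.x = x  # instance attributes, stored in the instance dictionary
        self.y = y


p = Point(1, 2)
print(vars(p))                      # instance attributes only
print("dimensions" in vars(Point))  # class attributes and methods live here
print(vars(p) is p.__dict__)        # vars() simply returns __dict__
```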

  17. Oct 2021
    1. >>> page = Page.objects.get(title="A Blog post") >>> page <Page: A Blog post> # Note: the blog post is an instance of Page so we cannot access body, date or feed_image >>> page.specific <BlogPage: A Blog post>

      You can convert a Page object to its more specific user-defined equivalent using the .specific property. This may cause an additional database lookup.

    1. Use settings to change the default templates used for each tag Specify templates using template and sub_menu_template arguments for any of the included menu tags (See Specifying menu templates using template tag parameters). Put your templates in a preferred location within your project and wagtailmenus will pick them up automatically (See Using preferred paths and names for your templates).

      Where to specify the templates for menus. If you don't use your own, the package falls back to default templates based on Bootstrap 3.

    2. While main menus always have to be defined for each site, for flat menus, you can support multiple sites using any of the following approaches: Define a new menu for each site Define a menu for your default site and reuse it for the others Create new menus for some sites, but use the default site’s menu for others You can even use different approaches for different flat menus in the same project. If you’d like to learn more, take a look at the fall_back_to_default_site_menus option in Supported arguments

      Using a main menu or a flat menu in Wagtail

    3. Have you noticed how the article pages are not shown below the ‘Latest news’ item, despite specifying allow_subnav=True on the menu item? Only pages with a show_in_menus value of True will be displayed (at any level) in rendered menus. The field is added by Wagtail, so is present for all custom page types. For page types that are better suited to showing on listing/index pages (for example: news articles or events) - you can set the show_in_menus_default attribute on the page type class to False to exclude them from menus by default.

      Basic wagtailmenus settings that control whether pages are shown in menus

    1. indent=True here is treated as indent=1, so it works, but I’m pretty sure nobody would intend that to mean an indent of 1 space
    2. bool is actually not a primitive data type — it’s actually a subclass of int!

      Python has only 5 primitives
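
The two quotes above fit together: because `bool` is a subclass of `int`, `True` is accepted wherever an integer is expected, including the `indent` argument of `json.dumps`. A short sketch using only the standard library:

```python
import json

print(issubclass(bool, int))   # bool inherits from int
print(True + True)             # True behaves like the integer 1

data = {"a": 1}
with_true = json.dumps(data, indent=True)
with_one = json.dumps(data, indent=1)
print(with_true == with_one)   # indent=True is treated like indent=1
print(with_true)
```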

    3. complex is a supertype of float, which, in turn, is a supertype of int.

      On some of Python's primitives

    4. Now since the “compiling to bytecode” step above takes a noticeable amount of time when you import a module, Python stores (marshalls) the bytecode into a .pyc file, and stores it in a folder called __pycache__. The __cached__ parameter of the imported module then points to this .pyc file. When the same module is imported again at a later time, Python checks if a .pyc version of the module exists, and then directly imports the already-compiled version instead, saving a bunch of time and computation.

      Python speeds up imports by caching compiled bytecode
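
A minimal sketch of where the cached bytecode ends up, assuming a throwaway module written to a temporary directory (the name `demo_module.py` is made up). It uses `py_compile` to trigger the compile step explicitly instead of relying on an import, and `importlib.util.cache_from_source` to ask Python where the `.pyc` belongs.

```python
import importlib.util
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "demo_module.py")
    with open(source, "w") as f:
        f.write("ANSWER = 42\n")

    # Where Python expects the cached bytecode for this source file.
    expected = importlib.util.cache_from_source(source)

    # Compile explicitly; an import would normally do this behind the scenes.
    compiled = py_compile.compile(source)
    pyc_exists = os.path.exists(compiled)

    print(os.path.relpath(compiled, tmp))
    print(compiled == expected, pyc_exists)
```

The file name includes the interpreter tag (for example `cpython-311`), so different Python versions can keep their caches side by side.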

    5. Bytecode is a set of micro-instructions for Python’s virtual machine. This “virtual machine” is where Python’s interpreter logic resides. It essentially emulates a very simple stack-based computer on your machine, in order to execute the Python code written by you.

      What bytecode does

    6. Python is compiled. In fact, all Python code is compiled, but not to machine code — to bytecode

      Python is compiled to bytecode
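
To see the bytecode mentioned above, the standard-library `dis` module can list the instructions of a compiled function. This is only a sketch; the exact instruction names differ between Python versions, so none are shown here. The function `add_one` is made up.

```python
import dis


def add_one(x):
    return x + 1


code = add_one.__code__
print(type(code.co_code).__name__, len(code.co_code))  # raw bytecode as bytes

# Human-readable view of the instructions the virtual machine executes.
opnames = []
for instruction in dis.get_instructions(add_one):
    opnames.append(instruction.opname)
    print(instruction.opname, instruction.argrepr)
```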

    7. Python always runs in debug mode by default. The other mode that Python can run in, is “optimized mode”. To run Python in “optimized mode”, you can invoke it by passing the -O flag. And all it does, is prevents assert statements from doing anything (at least so far), which in all honesty, isn’t really useful at all.

      Python debug vs optimized mode
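
A small sketch of the difference, assuming the current interpreter can be launched as a subprocess: the same one-line program is run with and without `-O`. The built-in constant `__debug__` reflects the mode, and the `assert` is skipped only in the optimized run.

```python
import subprocess
import sys

snippet = "assert False, 'boom'; print('debug flag:', __debug__)"


def run(flags):
    """Run the snippet in a fresh interpreter with the given flags."""
    return subprocess.run(
        [sys.executable, *flags, "-c", snippet],
        capture_output=True,
        text=True,
    )


normal = run([])
optimized = run(["-O"])

print("default mode exit code:", normal.returncode)   # the assert fires
print("optimized mode output:", optimized.stdout.strip())
```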

    8. np = __import__('numpy') # Same as doing 'import numpy as np'
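
A brief sketch of how `__import__` behaves, using standard-library modules instead of numpy so it runs anywhere. One detail worth knowing: for dotted names, `__import__` returns the top-level package, whereas `importlib.import_module` returns the submodule itself.

```python
import importlib

json_a = __import__("json")               # same as 'import json as json_a'
json_b = importlib.import_module("json")
print(json_a is json_b)                   # both return the cached module

top = __import__("os.path")               # returns the top-level 'os' module
sub = importlib.import_module("os.path")  # returns the submodule itself
print(top.__name__)
print(sub is top.path)
```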
    9. This refers to the module spec. It contains metadata such as the module name, what kind of module it is, as well as how it was created and loaded.

      __spec__
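
For a concrete look at that metadata, this sketch inspects the spec of the standard-library `json` package. The printed paths depend on the installation, so no output is shown.

```python
import json

spec = json.__spec__
print(spec.name)                    # fully qualified module name
print(spec.origin)                  # where the module was loaded from
print(type(spec.loader).__name__)   # the loader that created the module
print(spec.parent)                  # the package the module belongs to
print(spec.cached)                  # location of the compiled bytecode, if any
```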

    10. let’s say you only want to support integer addition with this class, and not floats. This is where you’d use NotImplemented

      Example use case of NotImplemented:

      class MyNumber:
          def __add__(self, other):
              if isinstance(other, float):
                  return NotImplemented
      
              return other + 42
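
A slightly fuller sketch of the same idea, with an assumed `value` attribute so the class is usable end to end. When `__add__` returns `NotImplemented`, Python tries the other operand's `__radd__`, and if that also fails it raises `TypeError` on our behalf.

```python
class MyNumber:
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Only integer addition is supported; let Python handle the rest.
        if not isinstance(other, int):
            return NotImplemented
        return MyNumber(self.value + other)


print((MyNumber(40) + 2).value)

try:
    MyNumber(40) + 2.5
    float_add_failed = False
except TypeError as exc:
    float_add_failed = True
    print("TypeError:", exc)
```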