TLA+ is a high-level language for modeling programs and systems--especially concurrent and distributed ones.
Need to look more into TLA+ and formal verification with regard to software development.
Essays on programming I think about a lot
Nice collection of programming essays
Introducing Module#const_source_location

Using Method#source_location made finding the location of any method fairly easy. Unfortunately, there wasn't an equivalent for constants. This meant that unless the constant you needed to find was defined in your codebase, finding its source location was not easy.
When FrozenError is raised, it is usually difficult to determine from the context which frozen object a modification was attempted on. Ruby 2.7 introduces FrozenError#receiver, which returns the frozen object that the modification was attempted on, similar to NameError#receiver. This can help pinpoint exactly which object is frozen.
A beginless range is experimentally introduced. It might not be as useful as an endless range, but would be good for DSL purposes.
dynamic
A dynamic language (Lisp, Perl, Python, Ruby) is designed to optimize programmer efficiency, so you can implement functionality with less code. A static language (C, C++, etc) is designed to optimize hardware efficiency, so that the code you write executes as quickly as possible. https://stackoverflow.com/questions/20563433/difference-between-static-and-dynamic-programming-languages
multi-paradigm
Programming paradigms are a way to classify programming languages based on their features; these include imperative, declarative, functional, and object-oriented, among others.
https://upload.wikimedia.org/wikipedia/commons/f/f7/Programming_paradigms.svg
https://en.wikipedia.org/wiki/Comparison_of_multi-paradigm_programming_languages
the overloaded operators ¬, =, ≠, and abs are defined
Most of Algol's "special" characters (⊂, ≡, ␣, ×, ÷, ≤, ≥, ≠, ¬, ⊃, ≡, ∨, ∧, →, ↓, ↑, ⌊, ⌈, ⎩, ⎧, ⊥, ⏨, ¢, ○ and □) can be found on the IBM 2741 keyboard with the APL "golf-ball" print head inserted; these became available in the mid-1960s while ALGOL 68 was being drafted. These characters are also part of the Unicode standard and most of them are available in several popular fonts.
How to Find Programmers For a Startup and a Company
Learn how to find programmers by reading this article.
Made an analogy between the internal combustion engine, which has 1000s of parts, and the "radical simplicity" approach taken by Tesla: they use an electric motor, which has only 2 components!
comparison: Sapper vs. Gatsby
Typical software requirements specify the following
Android is an operating system based on Linux with a Java programming interface, built for mobile devices such as smartphones (touch-screen devices that support Android OS) as well as tablets.
To learn more about Android, visit the Android Tutorial
Bootstrap is an open-source HTML, CSS, and JavaScript framework for building responsive and mobile-first applications on the web. To learn more about bootstrap visit Bootstrap Tutorial
LINQ means Language Integrated Query and it was introduced in .NET Framework 3.5 to query the data from different data sources such as collections, generics, XML Documents, ADO.NET Datasets, SQL, Web Service, etc. in C# and VB.NET. To learn more about LINQ visit LINQ Tutorial
Visual Basic (VB) is an object-oriented programming language that enables developers to build a variety of secure and robust applications that run on the .NET Framework.
To learn more about Visual Basic, refer to the Visual Basic (VB.NET) Tutorial
The brain uses the same areas to process code as it does to process speech. Researchers found that programming is like talking: the brain regions that are most active during coding are those that are also relevant to the processing of natural language.
In Python, setting up a basic logger is very simple
Rather than debugging with print statements, it is better to use logging.
Sample logger:
import logging

logging.basicConfig(
    filename='application.log',
    level=logging.WARNING,
    format='[%(asctime)s] {%(pathname)s:%(lineno)d} %(levelname)s - %(message)s',
    datefmt='%H:%M:%S'
)

logging.error("Some serious error occurred.")
logging.warning('Function you are using is deprecated.')
the sample result:
[12:52:35] {<stdin>:1} ERROR - Some serious error occurred.
[12:52:35] {<stdin>:1} WARNING - Function you are using is deprecated.
To find the log file location, type:
logging.getLoggerClass().root.handlers[0].baseFilename
- High-level modules should not depend on low-level modules. Both should depend on the abstraction.
- Abstractions should not depend on details. Details should depend on abstractions.
SOLI(D)
Dependency Inversion
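A minimal Python sketch of Dependency Inversion (all names here are hypothetical, just to illustrate the principle):

from abc import ABC, abstractmethod

# The abstraction that both layers depend on
class MessageSender(ABC):
    @abstractmethod
    def send(self, text: str) -> None: ...

# Low-level detail: depends on the abstraction
class EmailSender(MessageSender):
    def send(self, text: str) -> None:
        print(f"Emailing: {text}")

# High-level module: depends on the abstraction, not on EmailSender
class OrderService:
    def __init__(self, sender: MessageSender):
        self.sender = sender
    def confirm(self) -> None:
        self.sender.send("Order confirmed")

OrderService(EmailSender()).confirm()  # Emailing: Order confirmed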

Clients should not be forced to depend on methods that they do not use.
SOL(I)D
Interface Segregation

If S is a subtype of T, then objects of type T in a program may be replaced with objects of type S without altering any of the desirable properties of that program.
SO(L)ID
Liskov Substitution

Classes should be open for extension, but closed for modification
S(O)LID
Open-Closed

A class should have a single responsibility
(S)OLID
Single Responsibility

Peikert, A., & Brandmaier, A. M. (2019). A Reproducible Data Analysis Workflow with R Markdown, Git, Make, and Docker. https://doi.org/10.31234/osf.io/8xzqy
This tightly controlled build environment is sometimes called a "holy build box". The Traveling Ruby project provides such a holy build box.
Don’t go to code academy, go to design academy. Be advocates of the user & consumer. It’s not about learning how to code, it’s about translating real-world needs to technological specifications in just ways that give end users agency and equity in design, development and delivery. Be a champion of user-centric design. Learn how to steward data and offer your help.
The importance of learning to design, and interpreting/translating real-world needs.
function (or in the case of type classes, we call these methods)
If you can tell that two types are equal or not equal, that type belongs in the Eq type class.
two types or rather two things of the same type?
Mehrotra, S., Rahimian, H., Barah, M., Luo, F., & Schantz, K. (2020 May 02). A model of supply-chain decisions for resource sharing with an application to ventilator allocation to combat COVID-19. Naval Research Logistics (NRL). https://doi.org/10.1002/nav.21905
Continuous Delivery or Deployment is about running checks as thorough as you can to catch issues in your code. Completeness of the checks is the most important factor; it is usually measured in terms of code coverage or functional coverage of your tests. Catching errors early prevents broken code from getting deployed to any environment and saves the precious time of your test team.
Continuous Delivery or Deployment (quick summary)
Continuous Integration is a trade-off between the speed of the feedback loop to developers and the relevance of the checks you perform (build and test). No code that would impede the team's progress should make it to the main branch.
Continuous Integration (quick summary)
A good CD build:
- Ensures that as many features as possible are working properly
- The faster the better, but it is not a matter of speed. A 30-60 minute build is OK
Good CD build
A good CI build:
- Ensures no code that breaks basic stuff and prevents other team members from working is introduced to the main branch
- Is fast enough to provide feedback to developers within minutes, to prevent context switching between tasks
Good CI build
The idea of Continuous Delivery is to prepare artefacts as close as possible to what you want to run in your environment. These can be jar or war files if you are working with Java, or executables if you are working with .NET. These can also be folders of transpiled JS code or even Docker containers; whatever makes deployment shorter (i.e. you have pre-built in advance as much as you can).
Idea of Continuous Delivery
Continuous Delivery is about being able to deploy any version of your code at all times. In practice it means the latest or next-to-latest version of your code.
Continuous Delivery
Continuous Integration is not about tools. It is about working in small chunks and integrating your new code to the main branch and pulling frequently.
Continuous Integration is not about tools
- The app should build and start
- Most critical features should be functional at all times (user signup/login journey and key business features)
- Common layers of the application that all the developers rely on should be stable. This means unit tests on those parts.
Things to be checked by Continuous Integration
Continuous Integration is all about preventing the main branch from being broken so your team is not stuck. That's it. It is not about having all your tests green all the time and the main branch deployable to production at every commit.
Continuous Integration prevents other team members from wasting time by pulling faulty code
// ES5-compatible code
var myObject = {
  prop1: 'hello',
  prop2: 'world',
  output: function() {
    console.log(this.prop1 + ' ' + this.prop2);
  }
};
myObject.output(); // hello world
Creating an object.
Scrum means that “you have to get certain things done with those two weeks.” Kanban means “do what you can do in two weeks.”
If you get a choice, push for Kanban over Scrum
What people will say is that estimates are for planning – that their purpose is to figure out how long some piece of work is going to take, so that everybody can plan accordingly. In all my five years shipping stuff, I can only recall one project where things really worked that way.
Project estimations are just energy drainers and stress producers
Be explicit about the difference between hard deadlines
Different types of deadlines:
If you delegate all your IT security to the InfoSec, they will come up with draconian rules
Try to do some of your own security before delegating everything to InfoSec that will come with draconian restrictions
you should always advocate for having a dedicated SRE if there’s any real risk of after-hours pages that are out of your control.
Site Reliability Engineers (ideally spread across time zones) should be in place whenever after-hours errors can be expected
I try to write a unit test any time the expected value of a defect is non-trivial.
Write unit tests at least for the most important parts of the code. Ideally, every chunk of code should have at least a trivial unit test around it: this verifies that the code is written in a testable way, which is extremely important
I’m defining an integration test as a test where you’re calling code that you don’t own
When to write integration tests:
Which database technology to choose
Which database to choose (advice from an Amazon employee):
I would use a serverless function when I have a relatively small and simple chunk of code that needs to run every once in a while.
When to make a serverless function (advice from an Amazon employee)
Programming languages. These will probably expose my ignorance pretty nicely.
When to use different programming languages (advice from an Amazon employee):
A few takeaways
Summarising the article:
I set it with a few clicks at Travis CI, and by creating a .travis.yml file in the repo
You can set CI with a few clicks using Travis CI and creating a .travis.yml file in your repo:
language: node_js
node_js: node
before_script:
- npm install -g typescript
- npm install codecov -g
script:
- yarn lint
- yarn build
- yarn test
- yarn build-docs
after_success:
- codecov
Continuous integration makes it easy to check against cases when the code:
- does not work (but someone didn't test it and pushed haphazardly),
- works only locally, as it is based on local installations,
- works only locally, as not all files were committed.
CI - Continuous Integration helps to check the code when it:
In Python, when trying to do a dubious operation, you get an error pretty soon. In JavaScript… an undefined can fly through a few layers of abstraction, causing an error in a seemingly unrelated piece of code.
Undefined nature of JavaScript can hide an error for a long time. For example,
function add(a, b) { return +(a + b) }
add(2, 2)   // 4
add('2', 2) // 22 -- '2' + 2 concatenates to '22', then unary + converts it to a number
will result in a number, but is it the same one?
With Codecov it is easy to make jest & Travis CI generate one more thing:
Codecov lets you generate a score on your tests:

I would use ESLint in full strength, tests for some (especially end-to-end, to make sure a commit does not make project crash), and add continuous integration.
Advantage of tests
It is fine to start adding tests gradually, by adding a few tests to things that are the most difficult (ones you need to keep fingers crossed so they work) or most critical (simple but with many other dependent components).
Start small by adding tests to the most crucial parts
I found that the overhead to use types in TypeScript is minimal (if any).
In TypeScript, unlike in JS, we need to specify the types:

I need to specify types of input and output. But then I get speedup due to autocompletion, hints, and linting if for any reason I make a mistake.
In TypeScript, you spend a bit more time in the variable definition, but then autocompletion, hints, and linting will reward you. It also boosts code readability
TSDoc is a way of writing TypeScript comments where they’re linked to a particular function, class or method (like Python docstrings).
TSDoc <--- TypeScript comment syntax. You can create documentation with TypeDoc
ESLint does automatic code linting
ESLint <--- pluggable JS linter:
if (x = 5) { ... }

Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.
According to Kernighan's Law, writing code is not as hard as debugging it
Write a new test and see the result. If you want to make it REPL-like, instead of writing console.log(x.toString()) use expect(x.toString()).toBe('') and you will directly get the result.
jest <--- interactive JavaScript (TypeScript and others too) testing framework. You can use it as a VS Code extension.
Basically, instead of console.log(x.toString()), you can use expect(x.toString()).toBe(''). Check this gif to understand it further
interactive notebooks fall short when you want to write bigger, maintainable code
Survey regarding programming notebooks:

I recommend the Airbnb JavaScript style guide and Airbnb TypeScript
Recommended style guides from Airbnb for:
Creating meticulous tests before exploring the data is a big mistake, and will result in a well-crafted garbage-in, garbage-out pipeline. We need an environment flexible enough to encourage experiments, especially in the initial phase.
The overzealous nature of TDD may discourage explorable data science
The programming language is augmented with natural language description details, where convenient, or with compact mathematical notation.
There are many types of CRDTs
CRDTs have different types, such as Grow-only set and Last-writer-wins register. Check more of them here
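As a taste, a grow-only set (G-Set) is small enough to sketch in a few lines of Python (a toy, not a production CRDT): merge is just set union, which is commutative, associative and idempotent, so replicas converge no matter how updates are exchanged.

class GSet:
    # Grow-only set CRDT: elements can be added, never removed
    def __init__(self):
        self.items = set()
    def add(self, item):
        self.items.add(item)
    def merge(self, other):
        # union is commutative, associative, idempotent -> convergence
        self.items |= other.items

a, b = GSet(), GSet()
a.add(1)
b.add(2)
a.merge(b)
b.merge(a)
assert a.items == b.items == {1, 2}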
Some of our main takeaways:
- CRDT literature can be relevant even if you're not creating a decentralized system
- Multiplayer for a visual editor like ours wasn't as intimidating as we thought
- Taking time to research and prototype in the beginning really paid off
Key takeaways of developing a live editing tool
traditional approaches that informed ours — OTs and CRDTs
Traditional approaches of the multiplayer technology
CRDTs refer to a collection of different data structures commonly used in distributed systems. All CRDTs satisfy certain mathematical properties which guarantee eventual consistency. If no more updates are made, eventually everyone accessing the data structure will see the same thing. This constraint is required for correctness; we cannot allow two clients editing the same Figma document to diverge and never converge again
CRDTs (Conflict-free Replicated Data Types)
They’re a great way of editing long text documents with low memory and performance overhead, but they are very complicated and hard to implement correctly
Characteristics of OTs
Even if you have a client-server setup, CRDTs are still worth researching because they provide a well-studied, solid foundation to start with
CRDTs are worth studying for a good foundation
Figma’s multiplayer servers keep track of the latest value that any client has sent for a given property on a given object
✅ No conflict:
❎ Conflict:
Figma doesn’t store any properties of deleted objects on the server. That data is instead stored in the undo buffer of the client that performed the delete. If that client wants to undo the delete, then it’s also responsible for restoring all properties of the deleted objects. This helps keep long-lived documents from continuing to grow in size as they are edited
Undo option
it's important to be able to iterate quickly and experiment before committing to an approach. That's why we first created a prototype environment to test our ideas instead of working in the real codebase
First work with a prototype, then the real codebase
Designers worried that live collaborative editing would result in “hovering art directors” and “design by committee” catastrophes.
Worries of using a live collaborative editing
We had a lot of trouble until we settled on a principle to help guide us: if you undo a lot, copy something, and redo back to the present (a common operation), the document should not change. This may seem obvious but the single-player implementation of redo means “put back what I did” which may end up overwriting what other people did next if you’re not careful. This is why in Figma an undo operation modifies redo history at the time of the undo, and likewise a redo operation modifies undo history at the time of the redo
Undo/Redo working
operational transforms (a.k.a. OTs), the standard multiplayer algorithm popularized by apps like Google Docs. As a startup we value the ability to ship features quickly, and OTs were unnecessarily complex for our problem space
Operational Transforms (OT) are unnecessarily complex for problems unlike Google Docs
Every Figma document is a tree of objects, similar to the HTML DOM. There is a single root object that represents the entire document. Underneath the root object are page objects, and underneath each page object is a hierarchy of objects representing the contents of the page. This tree is presented in the layers panel on the left-hand side of the Figma editor.
Structure of Figma documents
When a document is opened, the client starts by downloading a copy of the file. From that point on, updates to that document in both directions are synced over the WebSocket connection. Figma lets you go offline for an arbitrary amount of time and continue editing. When you come back online, the client downloads a fresh copy of the document, reapplies any offline edits on top of this latest state, and then continues syncing updates over a new WebSocket connection
Offline editing isn't a problem, unlike the online one
An important consequence of this is that changes are atomic at the property value boundary. The eventually consistent value for a given property is always a value sent by one of the clients. This is why simultaneous editing of the same text value doesn’t work in Figma. If the text value is B and someone changes it to AB at the same time as someone else changes it to BC, the end result will be either AB or BC but never ABC
Consequence of approaches like last-writer-wins
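A toy Python sketch of last-writer-wins at the property boundary (hypothetical names, not Figma's actual code):

# The server keeps only the latest value sent for each (object, property)
state = {}

def handle_update(obj_id, prop, value):
    state[(obj_id, prop)] = value  # the whole value is overwritten, never merged

handle_update("text1", "characters", "AB")  # client 1 changed B -> AB
handle_update("text1", "characters", "BC")  # client 2 changed B -> BC, arrived later
print(state[("text1", "characters")])       # BC -- never ABC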
We use a client/server architecture where Figma clients are web pages that talk with a cluster of servers over WebSockets. Our servers currently spin up a separate process for each multiplayer document which everyone editing that document connects to
Way Figma approaches client/server architecture
CRDTs are designed for decentralized systems where there is no single central authority to decide what the final state should be. There is some unavoidable performance and memory overhead with doing this. Since Figma is centralized (our server is the central authority), we can simplify our system by removing this extra overhead and benefit from a faster and leaner implementation
CRDTs are designed for decentralized systems
Sometimes it's interesting to explain some code (how much time do you spend trying to figure out a regex pattern when you see one?), but 99% of the time, comments can be avoided.
Generally try to avoid (avoid != forbid) comments.
Comments:
When we talk about abstraction levels, we can classify the code in 3 levels:
- high: getAddress
- medium: inactiveUsers = Users.findInactives
- low: .split(" ")
3 abstraction levels:
Explanation: searchForSomething(), account.unverifyAccount, map, to_downcase and so on are examples at different abstraction levels. The ideal is not to mix the abstraction levels in only one function.
Try not mixing abstraction levels inside a single function
There is another maxim that says you should write the same code a maximum of 3 times. The third time, you should consider refactoring and reducing duplication.
Avoid repeating the same code over and over
Should be nouns, and not verbs, because classes represent concrete objects
Class names = nouns
Uncle Bob, in Clean Code, argues that the best order to write code is: write unit tests, create code that works, then refactor to clean the code.
Best order to write code (according to Uncle Bob):
int d could be int days
When naming things, focus on giving meaningful names, that you can pronounce and are searchable. Also, avoid prefixes
naming things, writing better functions and a little about comments. Next, I intend to talk about formatting, objects and data structures, how to handle errors, about boundaries (how to deal with someone else's code), unit testing and how to organize your class better. I know that it'll be missing an important topic about code smells
Ideas to consider while developing clean code:
Should be verbs, and not nouns, because methods represent actions that objects must do
Methods names = verbs
A way to decrease switch/if/else usage is to use polymorphism
It's better to avoid excessive switch/if/else statements
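For example, a minimal Python sketch (hypothetical shape classes) of replacing a type-switch with polymorphism:

# Instead of: if kind == "circle": ... elif kind == "square": ...
class Circle:
    def __init__(self, r):
        self.r = r
    def area(self):
        return 3.14159 * self.r ** 2

class Square:
    def __init__(self, side):
        self.side = side
    def area(self):
        return self.side ** 2

shapes = [Circle(1), Square(2)]
print(sum(s.area() for s in shapes))  # each object carries its own behavior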

In the ideal world, they should be 1 or 2 levels of indentation
Functions in the ideal world shouldn't be long
"The Big Picture" is one of those things that people say a whole lot but can mean so many different things. Going through all of these articles, they tend to mean any (or all) of these things
Thinking about The Big Picture:
Considering that there are still a ton of COBOL jobs out there, there is no particular technology that you need to know
Right, there is no specific need to learn that one technology
read Knuth, or Pragmatic Programming, or Clean Code, or some other popular book
Classic programming related books
Senior developers are more cautious, thoughtful, pragmatic, practical and simple in their approaches to solving problems.
Interesting definition of senior devs
In recent years we’ve also begun to see increasing interest in exploratory testing as an important part of the agile toolbox
Waterfall software development ---> agile ---> exploratory testing
When I began coding, around 30 years ago, waterfall software development was used nearly exclusively.
Mathematica didn’t really help me build anything useful, because I couldn’t distribute my code or applications to colleagues (unless they spent thousands of dollars for a Mathematica license to use it), and I couldn’t easily create web applications for people to access from the browser. In addition, I found my Mathematica code would often end up much slower and more memory hungry than code I wrote in other languages.
Disadvantages of Mathematica:
In the 1990s, however, things started to change. Agile development became popular. People started to understand the reality that most software development is an iterative process
a methodology that combines a programming language with a documentation language, thereby making programs more robust, more portable, more easily maintained, and arguably more fun to write than programs that are written only in a high-level language. The main idea is to treat a program as a piece of literature, addressed to human beings rather than to a computer.
Literate programming described by Donald Knuth
Development | Pros | Cons
Table comparing pros and cons of:
This kind of “exploring” is easiest when you develop on the prompt (or REPL), or using a notebook-oriented development system like Jupyter Notebooks
It's easier to explore the code:
but, it's not efficient to develop in them
notebook contains an actual running Python interpreter instance that you’re fully in control of. So Jupyter can provide auto-completions, parameter lists, and context-sensitive documentation based on the actual state of your code
Notebook makes it easier to handle dynamic Python features
They switch to get features like good doc lookup, good syntax highlighting, integration with unit tests, and (critically!) the ability to produce final, distributable source code files, as opposed to notebooks or REPL histories
Things missed in Jupyter Notebooks:
Exploratory programming is based on the observation that most of us spend most of our time as coders exploring and experimenting
In exploratory programming, we:
Developing in the cloud
Well paid cloud platforms:
Finding a database management system that works for you
Well paid database technologies:
Here are a few very prominent technologies that you can look into and what impact each one might have on your salary
Other well paid frameworks, libraries and tools:
What programming language should I learn next?
Most paid programming languages:
Android and iOS
Payment for mobile OS:
Frontend Devs: What should I learn after JavaScript? Explore these frameworks and libraries
Most paid JS frameworks and libraries:
First, you’ve spread the logic across a variety of different systems, so it becomes more difficult to reason about the application as a whole. Second, more importantly, the logic has been implemented as configuration as opposed to code. The logic is constrained by the ability of the applications which have been wired together, but it’s still there.
Why "no code" trend is dangerous in some way (on the example of Zapier):
the developer doesn’t need to worry about allocating memory, or the character set encoding of the string, or a host of other things.
Comparison of C (1972) and TypeScript (2012) code.
(check the code above)
“No Code” systems are extremely good for putting together proofs-of-concept which can demonstrate the value of moving forward with development.
Great point of "no code" trend
With someone else’s platform, you often end up needing to construct elaborate work-arounds for missing functionality, or indeed cannot implement a required feature at all.
You can quickly implement 80% of the solution in Salesforce using a mix of visual programming (basic rule setting and configuration), but later it's not so straightforward to add the missing 20%
Summary
In doing a code review, you should make sure that:
"Continuous Delivery is the ability to get changes of all types — including new features, configuration changes, bug fixes, and experiments — into production, or into the hands of users, safely and quickly in a sustainable way". -- Jez Humble and Dave Farley
Continuous Delivery
Another approach is to use a tool like H2O to export the model as a POJO in a JAR Java library, which you can then add as a dependency in your application. The benefit of this approach is that you can train the models in a language familiar to Data Scientists, such as Python or R, and export the model as a compiled binary that runs in a different target environment (JVM), which can be faster at inference time
H2O - export models trained in Python/R as a POJO in JAR
Continuous Delivery for Machine Learning (CD4ML) is a software engineering approach in which a cross-functional team produces machine learning applications based on code, data, and models in small and safe increments that can be reproduced and reliably released at any time, in short adaptation cycles.
Continuous Delivery for Machine Learning (CD4ML) (long definition)
Basic principles:
In order to formalise the model training process in code, we used an open source tool called DVC (Data Science Version Control). It provides similar semantics to Git, but also solves a few ML-specific problems:
DVC - transform model training process into code.
Advantages:
Machine Learning pipeline for our Sales Forecasting problem, and the 3 steps to automate it with DVC
Sales Forecasting process

Continuous Delivery for Machine Learning end-to-end process

common functional silos in large organizations can create barriers, stifling the ability to automate the end-to-end process of deploying ML applications to production
Common ML process (leading to delays and frictions)

There are different types of testing that can be introduced in the ML workflow.
Automated tests for ML system:
example of how to combine different test pyramids for data, model, and code in CD4ML
Combining tests for data (purple), model (green) and code (blue)

A deployment pipeline automates the process for getting software from version control into production, including all the stages, approvals, testing, and deployment to different environments
Deployment pipeline
We chose to use GoCD as our Continuous Delivery tool, as it was built with the concept of pipelines as a first-class concern
GoCD - open source Continuous Delivery tool
Continuous Delivery for Machine Learning (CD4ML) is the discipline of bringing Continuous Delivery principles and practices to Machine Learning applications.
Continuous Delivery for Machine Learning (CD4ML)
Sometimes, the best way to learn is to mimic others. Here are some great examples of projects that use documentation well:
Examples of projects that use documentation well
(check the list below)
“Code is more often read than written.” — Guido van Rossum
Documenting code is describing its use and functionality to your users. While it may be helpful in the development process, the main intended audience is the users.
Documenting code:
Class method docstrings should contain the following:
- A brief description of what the method is and what it's used for
- Any arguments (both required and optional) that are passed, including keyword arguments
- Labels for any arguments that are considered optional or have a default value
- Any side effects that occur when executing the method
- Any exceptions that are raised
- Any restrictions on when the method can be called
Class method docstrings should contain:
(check example below)
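A sketch of what such a docstring could look like (hypothetical class and method, Google-style sections):

class BankAccount:
    def withdraw(self, amount, currency="USD"):
        """Withdraw money from the account.

        Args:
            amount: The amount to withdraw.
            currency: Optional; the currency code. Defaults to "USD".

        Side effects:
            Decreases the account balance.

        Raises:
            ValueError: If amount exceeds the current balance.
        """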
Comments to your code should be kept brief and focused. Avoid using long comments when possible. Additionally, you should use the following four essential rules as suggested by Jeff Atwood:
Comments should be as concise as possible. Moreover, you should follow 4 rules of Jeff Atwood:
From examining the type hinting, you can immediately tell that the function expects the input name to be of a type str, or string. You can also tell that the expected output of the function will be of a type str, or string, as well.
Type hinting, introduced in Python 3.5, complements Jeff Atwood's 4 rules by letting the code comment itself, as in this example:
def hello_name(name: str) -> str:
    return f"Hello {name}"
Docstrings can be further broken up into three major categories:
- Class Docstrings: Class and class methods
- Package and Module Docstrings: Package, modules, and functions
- Script Docstrings: Script and functions
3 main categories of docstrings
According to PEP 8, comments should have a maximum length of 72 characters.
If comment_size > 72 characters:
split it into multiple comment lines
Docstring conventions are described within PEP 257. Their purpose is to provide your users with a brief overview of the object.
Docstring conventions
All multi-lined docstrings have the following parts:
- A one-line summary line
- A blank line proceeding the summary
- Any further elaboration for the docstring
- Another blank line
Multi-line docstring example:
"""This is the summary line
This is the further elaboration of the docstring. Within this section,
you can elaborate further on details as appropriate for the situation.
Notice that the summary and the elaboration is separated by a blank new
line.
# Notice the blank line above. Code should continue on this line.
say_hello.__doc__ = "A simple function that says hello... Richie style"
Example of using __doc__:
Code (version 1):
def say_hello(name):
    print(f"Hello {name}, is it me you're looking for?")

say_hello.__doc__ = "A simple function that says hello... Richie style"
Code (alternative version):
def say_hello(name):
    """A simple function that says hello... Richie style"""
    print(f"Hello {name}, is it me you're looking for?")
Input:
>>> help(say_hello)
Returns:
Help on function say_hello in module __main__:

say_hello(name)
    A simple function that says hello... Richie style
class constructor parameters should be documented within the __init__ class method docstring
__init__
Scripts are considered to be single file executables run from the console. Docstrings for scripts are placed at the top of the file and should be documented well enough for users to be able to have a sufficient understanding of how to use the script.
Docstrings in scripts
Documenting your code, especially large projects, can be daunting. Thankfully there are some tools out and references to get you started
You can always facilitate documentation with tools.
(check the table below)
Commenting your code serves multiple purposes
Multiple purposes of commenting:
BUG, FIXME, TODO

In general, commenting is describing your code to/for developers. The intended main audience is the maintainers and developers of the Python code. In conjunction with well-written code, comments help to guide the reader to better understand your code and its purpose and design
Commenting code:
Along with these tools, there are some additional tutorials, videos, and articles that can be useful when you are documenting your project
Recommended videos to start documenting
(check the list below)
If you use argparse, then you can omit parameter-specific documentation, assuming it’s correctly been documented within the help parameter of the argparser.parser.add_argument function. It is recommended to use the __doc__ for the description parameter within argparse.ArgumentParser’s constructor.
argparse
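A minimal sketch of that recommendation (hypothetical script):

"""Greet a user from the command line."""
import argparse

parser = argparse.ArgumentParser(description=__doc__)  # reuse the script docstring
parser.add_argument("name", help="name of the person to greet")
args = parser.parse_args()
print(f"Hello {args.name}")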
There are specific docstrings formats that can be used to help docstring parsers and users have a familiar and known format.
Different docstring formats:
Daniele Procida gave a wonderful PyCon 2017 talk and subsequent blog post about documenting Python projects. He mentions that all projects should have the following four major sections to help you focus your work:
Public and Open Source Python projects should have the docs folder, and inside of it:
(check the table below for a summary)
Since everything in Python is an object, you can examine the directory of the object using the dir() command
dir() function examines directory of Python objects. For example dir(str).
Inside dir(str) you can find interesting property __doc__
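For example:

>>> "__doc__" in dir(str)
True
>>> print(str.__doc__)  # prints the docstring stored in __doc__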
Documenting your Python code is all centered on docstrings. These are built-in strings that, when configured correctly, can help your users and yourself with your project’s documentation.
Docstrings - built-in strings that help with documentation
Along with docstrings, Python also has the built-in function help() that prints out the object's docstring to the console.
help() function.
After typing help(str) it will return all the info about str object
The general layout of the project and its documentation should be as follows:
project_root/
│
├── project/ # Project source code
├── docs/
├── README
├── HOW_TO_CONTRIBUTE
├── CODE_OF_CONDUCT
├── examples.py
(private, shared or open sourced)
In all cases, the docstrings should use the triple-double quote (""") string format.
Stick to """ (triple double quotes) when using docstrings
Each format makes tradeoffs in encoding, flexibility, and expressiveness to best suit a specific use case.
Each data format brings different tradeoffs:
Computers can only natively store integers, so they need some way of representing decimal numbers. This representation comes with some degree of inaccuracy. That's why, more often than not, .1 + .2 != .3
Computers have to approximate decimal numbers when storing them
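A quick Python illustration (the decimal module trades speed for exactness):

>>> 0.1 + 0.2
0.30000000000000004
>>> 0.1 + 0.2 == 0.3
False
>>> from decimal import Decimal
>>> Decimal("0.1") + Decimal("0.2") == Decimal("0.3")
True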
Cross-platform development is now the standard because of the wide variety of architectures: mobile devices, cloud servers, embedded IoT systems. 20 years ago it was almost exclusively PCs.
A package management ecosystem is essential for programming languages now. People simply don’t want to go through the hassle of finding, downloading and installing libraries anymore. 20 years ago we used to visit web sites, downloaded zip files, copied them to correct locations, added them to the paths in the build configuration and prayed that they worked.
How library management changed in 20 years
IDEs and the programming languages are getting more and more distant from each other. 20 years ago an IDE was specifically developed for a single language, like Eclipse for Java, Visual Basic, Delphi for Pascal etc. Now, we have text editors like VS Code that can support any programming language with IDE like features.
How IDEs "unified" in comparison to the last 20 years
Your project has no business value today unless it includes blockchain and AI, although a centralized and rule-based version would be much faster and more efficient.
Comparing current project needs to those 20 years ago
Being a software development team now involves all team members performing a mysterious ritual of standing up together for 15 minutes in the morning and drawing occult symbols with post-its.
In comparison to 20 years ago ;)
Language tooling is richer today. A programming language was usually a compiler and perhaps a debugger. Today, they usually come with the linter, source code formatter, template creators, self-update ability and a list of arguments that you can use in a debate against the competing language.
How coding became much more supported in comparison to the last 20 years
There is StackOverflow which simply didn’t exist back then. Asking a programming question involved talking to your colleagues.
20 years ago StackOverflow wouldn't give you a hand
Since we have much faster CPUs now, numerical calculations are done in Python which is much slower than Fortran. So numerical calculations basically take the same amount of time as they did 20 years ago.
Python vs Fortran ;)
I am not sure how but one kind soul somehow found the project, forked it, refactored it, "modernized" it, added linting, code sniffing, added CI and opened the pull request.
It's worth sharing your code, since someone can always find it and improve it, so that you can learn from it
It is solved when you understand why it occurred and why it no longer does.
What does it mean for a problem to be solved?
Let's reason through our memoizer before we write any code.
Operations performed by a memoizer:
Which is written as:
// Takes a reference to a function
const memoize = func => {
  // Creates a cache of results
  const results = {};
  // Returns a function
  return (...args) => {
    // Create a key for results cache
    const argsKey = JSON.stringify(args);
    // Only execute func if no cached value
    if (!results[argsKey]) {
      // Store function call result in cache
      results[argsKey] = func(...args);
    }
    // Return cached value
    return results[argsKey];
  };
};
Let's replicate our inefficientSquare example, but this time we'll use our memoizer to cache results.
Replication of a function with the use of memoizer (check the code below this annotation)
The biggest problem with JSON.stringify is that it doesn't serialize certain inputs, like functions and Symbols (and anything you wouldn't find in JSON).
Problem with JSON.stringify.
This is why the previous code shouldn't be used in production
Memoization is an optimization technique used in many programming languages to reduce the number of redundant, expensive function calls. This is done by caching the return value of a function based on its inputs.
Memoization (simple definition)
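Python ships this pattern in the standard library; a minimal sketch with functools.lru_cache, mirroring the inefficientSquare example above (cached arguments must be hashable, which sidesteps the JSON.stringify issue mentioned earlier):

from functools import lru_cache

@lru_cache(maxsize=None)  # cache results keyed by the function's arguments
def inefficient_square(num):
    # deliberately wasteful, like the article's inefficientSquare
    total = 0
    for _ in range(num):
        for _ in range(num):
            total += 1
    return total

inefficient_square(5000)  # slow: actually computed
inefficient_square(5000)  # fast: served from the cache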
The best way to explain the difference between launch and attach is to think of a launch configuration as a recipe for how to start your app in debug mode before VS Code attaches to it, while an attach configuration is a recipe for how to connect VS Code's debugger to an app or process that's already running.
Simple difference between two core debugging modes: Launch and Attach available in VS Code.
Depending on the request (attach or launch), different attributes are required, and VS Code's launch.json validation and suggestions should help with that.
Logpoint is a variant of a breakpoint that does not "break" into the debugger but instead logs a message to the console. Logpoints are especially useful for injecting logging while debugging production servers that cannot be paused or stopped. A Logpoint is represented by a "diamond" shaped icon. Log messages are plain text but can include expressions to be evaluated within curly braces ('{}').
Logpoints - log messages to the console when breakpoint is hit.
Can include expressions to be evaluated with {}, e.g.:
fib({num}): {result}
Here are some optional attributes available to all launch configurations
Optional arguments for launch.json:
presentation ("order", "group" or "hidden")preLaunchTaskpostDebugTaskinternalConsoleOptionsdebugServerserverReadyActionThe following attributes are mandatory for every launch configuration
In the launch.json file you have to define at least these 3 attributes:
type (e.g. "node", "php", "go")request ("launch" or "attach")name (name to appear in the Debug launch configuration drop-down)Many debuggers support some of the following attributes
Some of the possibly supported attributes in launch.json:
- program
- args
- env
- cwd
- port
- stopOnEntry
- console (e.g. "internalConsole", "integratedTerminal", "externalTerminal")

Version control is at the heart of any modern engineering org. The ability for multiple engineers to asynchronously contribute to a codebase is crucial—and with notebooks, it's very hard.
Version control in notebooks?
The priorities in building a production machine learning pipeline—the series of steps that take you from raw data to product—are not fundamentally different from those of general software engineering.
Reproducibility is an issue with notebooks. Because of the hidden state and the potential for arbitrary execution order, generating a result in a notebook isn’t always as simple as clicking “Run All.”
Problem of reproducibility in notebooks
A notebook, at a very basic level, is just a bunch of JSON that references blocks of code and the order in which they should be executed. But notebooks prioritize presentation and interactivity at the expense of reproducibility. YAML is the other side of that coin, ignoring presentation in favor of simplicity and reproducibility—making it much better for production.
Summary of the article:
Notebook = presentation + interactivity
YAML = simplicity + reproducibility
Notebook files, however, are essentially giant JSON documents that contain the base-64 encoding of images and binary data. For a complex notebook, it would be extremely hard for anyone to read through a plaintext diff and draw meaningful conclusions—a lot of it would just be rearranged JSON and unintelligible blocks of base-64.
Git traces plaintext differences and with notebooks it's a problem
There is no hidden state or arbitrary execution order in a YAML file, and any changes you make to it can easily be tracked by Git
In comparison to notebooks, YAML is more compatible for Git and in the end might be a better solution for ML
Python unit testing libraries, like unittest, can be used within a notebook, but standard CI/CD tooling has trouble dealing with notebooks for the same reasons that notebook diffs are hard to read.
unittest Python library doesn't work well in a notebook
Use camelCase when naming objects, functions, and instances.
camelCase for objects, functions and instances
const thisIsMyFunction = () => {};
Use PascalCase only when naming constructors or classes.
PascalCase for constructors and classes
// good
class User {
  constructor(options) {
    this.name = options.name;
  }
}

const good = new User({
  name: 'yup',
});
Use uppercase only in constants.
Uppercase for constants
export const API_KEY = 'SOMEKEY';
If you'd just like to see refactorings without Quick Fixes, you can use the Refactor command (Ctrl+Shift+R).
To easily see all the refactoring options, use the "Refactor" command
Good old Uncle Bob says that, on top of your day job, you should devote 20 hours a week to programming. Divided across 7 days a week, that comes out to almost 3 hours a day. Not much for some, a lot for others.
Uncle Bob's advice: ~ 3h/day for programming
University of Amsterdam scientists launch website that seeks ideal COVID-19 exit strategy. (2020 April 21) Science|Business. https://sciencebusiness.net/network-updates/university-amsterdam-scientists-launch-website-seeks-ideal-covid-19-exit-strategy
Guido Salvaneschi on Twitter referencing thread by Neil Ferguson
This type relation is sometimes written S <: T
subtyping allows a function to be written to take an object of a certain type T, but also work correctly, if passed an object that belongs to a type S that is a subtype of T (according to the Liskov substitution principle)
chose the term ad hoc polymorphism to refer to polymorphic functions that can be applied to arguments of different types, but that behave differently depending on the type of the argument to which they are applied (also known as function overloading or operator overloading)
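Python's functools.singledispatch gives a flavor of ad hoc polymorphism: one function name whose behavior depends on the argument's type (a minimal sketch, hypothetical describe function):

from functools import singledispatch

@singledispatch
def describe(value):
    return f"some value: {value}"

@describe.register(int)
def _(value):
    return f"an integer: {value}"

@describe.register(str)
def _(value):
    return f"a string: {value!r}"

print(describe(42))    # an integer: 42
print(describe("hi"))  # a string: 'hi'
print(describe(3.5))   # some value: 3.5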
I strongly suggest to anyone who wants to become a developer that they do it as well. I mean, it's really easy to see all the work that's out there, and all the things that are left to learn, and think that it's just way beyond you. But when you write it down, you have a place that you can go back to, and not only have I been able to help other people with my blog posts, but I help myself. I'm constantly Googling something and getting my own website in response, and like, oh yeah, I remember I did that before.