programming-guidelines

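A comprehensive guide to programming practice that helps developers improve their skills

This project provides a comprehensive programming guide covering data structures, development practices, remote APIs, and more. It collects lessons the author learned over many years, to help developers avoid common problems and improve code quality. The guide emphasizes simple, readable code and careful data structure design, and gives practical advice on database use, performance optimization, and team collaboration. It is suitable as a reference for developers at every level.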

Programming Guidelines

My opinionated programming guidelines.

1. Introduction

2. Data structures

3. Dev

4. Remote APIs

5. Op

6. Networking

7. Monitoring

8. Communication with others

9. Epilog

1. Introduction

About this README

I was born in 1976. I started coding with BASIC and assembler when I was 13. Later Turbo Pascal. From 1996 to 2001 I studied computer science at HTW-Dresden (Germany). I learned Shell, Perl, Prolog, C, C++, Java, PHP, and finally Python.

Sometimes I see young and talented programmers wasting time. There are two ways to learn: Make mistakes yourself, or learn from the mistakes other people made.

This list summarises a lot of mistakes I made in the past. I wrote it to help you avoid these mistakes.

It's my personal opinion and feeling. No facts, no single truth.

I need your feedback

If you have a general question, please start a new discussion.

If you think something is wrong or missing, feel free to open an issue or pull request.

Relaxed focus on your monitor

Do not look at the keyboard while you type. Have a relaxed focus on your monitor.

I type with ten fingers. It's like flying if you learned it. Your eyes can stay on the rubbish you type, and you don't need to move your eyes down (to keyboard) and up (to monitor) several hundred times per day. This saves a lot of energy. This is a simple tool to help you to learn touch typing: tipp10

Measure your typing speed: 10fastfingers.com

Avoid switching between mouse and keyboard too much.

I like Lenovo keyboards with track point. If you want more grip, then read Desktop Tips "Keyboard"

Once I was fascinated by the copy+paste history of Emacs and PyCharm. But then I thought to myself: "I want more. I am hungry. I want a copy+paste history not only in one application, but I also want it for the whole desktop". The solution is very simple, but somehow only a few people use it. The solution is called a clipboard manager. I use CopyQ. I use ctrl+alt+v to open the list of last copy+paste texts. CopyQ supports regex searches in the history.

Avoid searching with your eyes

Avoid searching with your eyes. Search with the tools of your IDE. You should be able to use it "blind". You should be able to move the cursor to the matching position in your code without looking at your keyboard, without grabbing your mouse/touchpad/TrackPoint and without looking up/down on your screen.

Compare two files with a diff tool. Otherwise, you might get this ugly skeptical frown.

How often per day do you search for the mouse cursor on your screen? Support your eyes by increasing the cursor size. If you use Ubuntu, you can do it via Universal Access / Cursor Size

Increase font size

During daily work, you often jump from one information snippet to the next information snippet.

When was the last time you read a text with more than 20 sentences?

I think from time to time you should do so. Slow down, focus on one text, and read slowly. It helps to increase the font-size. ctrl-+ is your friend.

KISS

Keep it simple and stupid. The most boring and most obvious solution is often the best. Although it sometimes takes months until you know which solution it is.

From the book "Site Reliability Engineering" (O'Reilly Media 2016) https://landing.google.com/sre/book/chapters/simplicity.html

Quote:

The Virtue of Boring

Unlike just about everything else in life, "boring" is a positive attribute when it comes to software! We don’t want our programs to be spontaneous and interesting; we want them to stick to the script and predictably accomplish their business goals.

Example: Pure Functions are great. They are stateless, their output can be cached forever, they are easy to test.
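A minimal sketch of what that means in Python. The function `gross_price` and its numbers are made up for illustration. Because the result depends only on the arguments, it can be cached with `functools.lru_cache` and tested without any setup.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def gross_price(net_cents: int, vat_percent: int) -> int:
    """Pure function: the result depends only on the arguments."""
    return net_cents + net_cents * vat_percent // 100


# Easy to test: no setup, no mocks, no database.
assert gross_price(1000, 19) == 1190

# Safe to cache: the second call with the same arguments is a cache hit.
gross_price(1000, 19)
assert gross_price.cache_info().hits == 1
```

The cache never needs invalidation here, because there is no hidden input that could change.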

Increase the obviousness

But it is not only about code. It is about the experience of all stakeholders: Users, salespeople, support hotline, developers,...

It is hard work to keep it simple.

One thing I love to do: "Increase the obviousness".

One tool to get there: Use a central wiki (without spaces), and define terms. Related text from me: Documentation in Intranets: My point of view

Avoid redundancy

See heading.

Premature optimization is the root of all evil.

The famous quote "premature optimization is the root of all evil" is true. You can read more about this in "When to optimize".

MVP

You should know what an MVP (minimum viable product) is. Building an MVP means bringing something usable to your customer, and then listening to their feedback. Care for their needs, not for your vision of a super performant application.

Avoid i18n in an MVP. German is my mother tongue. If I develop an MVP for German users, then I won't do i18n. This can be done later, if needed.


2. Data structures

Introduction

"Bad programmers worry about the code. Good programmers worry about data structures and their relationships." -- Linus Torvalds (creator and developer of the Linux kernel and the version control system git)

Cache vs Database

There is a fundamental fact which you need to understand: The difference between a cache and a database.

Remember the basic Input-Process-Output pattern.

In a cache you store data which is output. That's handy since you can access the output without doing the processing again. But cache invalidation is hard. Maybe the input has changed, and the value in the cache is outdated? Who knows? If possible, avoid caching: without a cache you never get outdated data. You don't need to back up your cache data. You can create it again.

In a database you store data which is input. Usually it was entered by a human by hand, or generated by measuring some real-world data. You can use the data in the database to create a nice HTML page. It is important to back up your valuable database data, since you can't create it again. The generated output (HTML, JSON, ...) has no value.

Data which is input usually has value. Data which is output has only little value, since you can re-create it.
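A small sketch of the Input-Process-Output idea, using two plain dictionaries as stand-ins. The names `database`, `cache`, `render`, `render_cached` and `update` are hypothetical and only serve the illustration.

```python
database = {"greeting": "Hello"}  # input: valuable, needs a backup
cache = {}                        # output: can be rebuilt at any time


def render(key):
    """Process: turn input into output (here a tiny HTML snippet)."""
    return f"<h1>{database[key]}</h1>"


def render_cached(key):
    """Return the cached output, rendering it first if it is missing."""
    if key not in cache:
        cache[key] = render(key)
    return cache[key]


def update(key, value):
    """Change the input and drop the output that was derived from it."""
    database[key] = value
    cache.pop(key, None)  # without this line the cache serves stale output


assert render_cached("greeting") == "<h1>Hello</h1>"
update("greeting", "Hi")
assert render_cached("greeting") == "<h1>Hi</h1>"

# Losing the cache costs nothing: everything can be created again.
cache.clear()
assert render_cached("greeting") == "<h1>Hi</h1>"
```

The hard part is hidden in `update`: every code path that changes the input has to remember the invalidation.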

Relational Database

I know SQL is..... It is either obvious or incomprehensible. And, yes, it is boring.

A relational database is a rock-solid data storage. Use it.

When I studied computer science, I disliked SQL. I thought it was an outdated solution. I tried to store data in files in XML format, used in-memory Berkeley DB, I used an object-oriented database written in Python (ZODB), I used NoSQL .... And finally, I realized that boring SQL is the best solution for most cases.

I use PostgreSQL.

I don't like NoSQL, except for caching (simple key-value DB).

The PostgreSQL Documentation contains an introduction to SQL and is easy to read.

If you want to share small SQL snippets, you can use https://dbfiddle.uk/

Cardinality

It does not matter how you work with your data (struct in C, classes in OOP, tables in SQL, ...). Cardinality is very important. Using 0..* is often easier to implement than 0..1. The first can be handled by a simple loop. The second is often a nullable column/attribute. You need conditions (IFs) to handle nullable columns/attributes.

https://en.wikipedia.org/wiki/Cardinality_(data_modeling)
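To make the difference between 0..1 and 0..* concrete, here is a sketch with two hypothetical contact classes. The class and function names are invented for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ContactWithOptionalPhone:
    name: str
    phone: Optional[str] = None  # 0..1


@dataclass
class ContactWithPhones:
    name: str
    phones: List[str] = field(default_factory=list)  # 0..*


def phone_lines_optional(contact: ContactWithOptionalPhone) -> List[str]:
    # 0..1 needs a condition for the "missing" case.
    if contact.phone is None:
        return []
    return [f"{contact.name}: {contact.phone}"]


def phone_lines_many(contact: ContactWithPhones) -> List[str]:
    # 0..* needs no special case: looping over an empty list does nothing.
    return [f"{contact.name}: {phone}" for phone in contact.phones]
```

Both functions return an empty list for a contact without a phone, but only the 0..1 variant needs an `if` to get there.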

If this is new to you, I will give you two examples:

  • 1:N --> One invoice has several invoice positions. For example, you buy three books in one order, the invoice will have three invoice positions. This is a 1:N relationship. The invoice position is contained in exactly one invoice.
  • N:M --> If you look at tags, for example at the Question+Answer site StackOverflow: One question can be related to several tags/topics and of course a topic can be set on several questions. For example, you have a strange UnicodeError in Python, then you can set the tags "python" and "unicode" on your question. This is an N:M relationship. One well-known example of N:M is users and groups.
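The tag example could be modelled like this. It is only a sketch: the table names are assumptions, and it uses Python's built-in `sqlite3` module instead of PostgreSQL so that it runs without a server. A 1:N relationship needs just a foreign key column on the "N" side; N:M needs a third table that links both sides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE question (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    -- N:M: one row per (question, tag) pair.
    CREATE TABLE question_tag (
        question_id INTEGER NOT NULL REFERENCES question(id),
        tag_id INTEGER NOT NULL REFERENCES tag(id),
        PRIMARY KEY (question_id, tag_id)
    );
    INSERT INTO question (id, title)
        VALUES (1, 'Strange UnicodeError'), (2, 'Slow loop');
    INSERT INTO tag (id, name) VALUES (1, 'python'), (2, 'unicode');
    INSERT INTO question_tag (question_id, tag_id)
        VALUES (1, 1), (1, 2), (2, 1);
""")


def tags_of(question_id):
    """All tag names set on one question."""
    rows = conn.execute(
        "SELECT name FROM tag WHERE id IN "
        "(SELECT tag_id FROM question_tag WHERE question_id = ?) "
        "ORDER BY name",
        (question_id,),
    )
    return [name for (name,) in rows]


def questions_with(tag_name):
    """All question titles that carry one tag."""
    rows = conn.execute(
        "SELECT title FROM question WHERE id IN "
        "(SELECT question_id FROM question_tag WHERE tag_id IN "
        "(SELECT id FROM tag WHERE name = ?)) "
        "ORDER BY title",
        (tag_name,),
    )
    return [title for (title,) in rows]


assert tags_of(1) == ["python", "unicode"]
assert questions_with("python") == ["Slow loop", "Strange UnicodeError"]
```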

Conditionless Data Structures

If you have no conditions in your data structures, then the coding for the input/output of your data will be much easier.

Avoid nullable Foreign Keys

Imagine you have a table "meeting" and a table "place". The table "meeting" has a ForeignKey to table "place". In the beginning, it might not be clear where the meeting will be. Most developers will make the ForeignKey optional (nullable). WAIT: This will create a condition in your data structure. There is a much easier solution: Create a place called "unknown". Use this sentinel value as default. This data structure (without a nullable ForeignKey) makes implementing the GUI much easier.

In other words: If there is no NULL in your data, then there will be fewer NullPointerExceptions in your source code while processing the data :-)

Fewer conditions, fewer bugs.
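A sketch of the "unknown" place, again with `sqlite3` as a stand-in for PostgreSQL. The column names and the choice of id 1 for the sentinel row are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only if asked
conn.executescript("""
    CREATE TABLE place (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    INSERT INTO place (id, name) VALUES (1, 'unknown');  -- sentinel row

    CREATE TABLE meeting (
        id INTEGER PRIMARY KEY,
        subject TEXT NOT NULL,
        -- Not nullable: a meeting without a known place points to 'unknown'.
        place_id INTEGER NOT NULL DEFAULT 1 REFERENCES place(id)
    );

    INSERT INTO place (id, name) VALUES (2, 'Room 42');
    INSERT INTO meeting (subject) VALUES ('Kick-off');
    INSERT INTO meeting (subject, place_id) VALUES ('Review', 2);
""")

# A plain inner join is enough. No LEFT JOIN and no "if place is None".
rows = conn.execute(
    "SELECT meeting.subject, place.name FROM meeting "
    "JOIN place ON place.id = meeting.place_id ORDER BY meeting.id"
).fetchall()

assert rows == [("Kick-off", "unknown"), ("Review", "Room 42")]
```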

Avoid nullable boolean columns

[True, False, Unknown] is not a nullable Boolean Column.

If you want to store data in a SQL database that has three states (True, False, Unknown), then you might think a nullable boolean column (here "my_column") is the right choice. But I think it is not. Do you think the SQL statement "select * from my_table where my_column = %s" works? No, it won't work, since "select * from my_table where my_column = NULL" will never return a single row. If you don't believe me, read: Effect of NULL in WHERE clauses (Wikipedia). If you like typing, you can work around this in your application, but I prefer straightforward solutions with only a few conditions.

If you want to store True, False, Unknown: Use text, integer, or a new table and a foreign key.
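The following sketch shows the problem and the text-based alternative side by side. It runs against SQLite through Python's `sqlite3` module; the table names `nullable_flag` and `text_state` are invented for the example, and the comparison with NULL behaves the same way in PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nullable_flag (id INTEGER PRIMARY KEY, my_column BOOLEAN);
    INSERT INTO nullable_flag (my_column) VALUES (1), (0), (NULL);

    CREATE TABLE text_state (
        id INTEGER PRIMARY KEY,
        my_column TEXT NOT NULL DEFAULT 'unknown'
            CHECK (my_column IN ('true', 'false', 'unknown'))
    );
    INSERT INTO text_state (my_column) VALUES ('true'), ('false'), ('unknown');
""")


def count(table, value):
    """Count rows where my_column equals value, with one generic query."""
    # The table name is a fixed literal in this sketch, never user input.
    sql = f"SELECT count(*) FROM {table} WHERE my_column = ?"
    return conn.execute(sql, (value,)).fetchone()[0]


assert count("nullable_flag", 1) == 1
assert count("nullable_flag", None) == 0  # "= NULL" never matches a row
assert count("text_state", "unknown") == 1  # same query, no special case
```

With the nullable column, the application needs a second query that uses `IS NULL`. With the text column, one parameterized query covers all three states.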

Avoid nullable characters columns

If you allow NULL in a character column, then you have two ways to express "empty":

  • NULL
  • empty string

Avoid it if possible. In most cases, you just need one variant of "empty". Simplest solution: do not allow NULL in columns holding character data.

If you think the character column should be allowed to be NULL (for example you want a unique, but optional identifier for rows), then consider a constraint: If the character string in the column is not NULL, then the string must not be empty. This way you ensure that there is only one variant of "empty".
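Such a constraint could look like the sketch below. The table `item` and the column `external_id` are hypothetical, and SQLite is used here only because it runs in-process.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE item (
        id INTEGER PRIMARY KEY,
        -- Optional but unique identifier: NULL is allowed, '' is not.
        external_id TEXT UNIQUE
            CHECK (external_id IS NULL OR external_id <> '')
    )
""")

conn.execute("INSERT INTO item (external_id) VALUES (NULL)")
conn.execute("INSERT INTO item (external_id) VALUES (NULL)")  # several NULLs are fine
conn.execute("INSERT INTO item (external_id) VALUES ('A-1')")

try:
    conn.execute("INSERT INTO item (external_id) VALUES ('')")
    empty_string_rejected = False
except sqlite3.IntegrityError:
    # The CHECK constraint refuses the second variant of "empty".
    empty_string_rejected = True

assert empty_string_rejected
```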

SQL: I prefer subqueries to joins

In most cases, I use an ORM to access data and don't write SQL by hand.

If I do write SQL by hand, then I often prefer SQL Subqueries to SQL Joins.

Have a look at this example:

SELECT id, name
FROM products
WHERE category_id IN
   (SELECT id
    FROM categories
    WHERE expired = True)

I can translate this to human language easily: Select all products, which belong to a category that has expired.
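For comparison, here is a runnable sketch that executes the subquery above and an equivalent join against the same data. The sample rows are invented. It uses `sqlite3`, and writes `expired = 1` instead of `expired = True` because older SQLite versions do not know the keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (id INTEGER PRIMARY KEY, expired BOOLEAN NOT NULL);
    CREATE TABLE products (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        category_id INTEGER NOT NULL REFERENCES categories(id)
    );
    INSERT INTO categories (id, expired) VALUES (1, 1), (2, 0);
    INSERT INTO products (id, name, category_id)
        VALUES (1, 'old book', 1), (2, 'new book', 2), (3, 'old map', 1);
""")

# Reads like the sentence: products whose category is in the expired ones.
SUBQUERY = """
    SELECT id, name FROM products
    WHERE category_id IN (SELECT id FROM categories WHERE expired = 1)
    ORDER BY id
"""

# Same result, but the reader has to follow the join condition first.
JOIN = """
    SELECT products.id, products.name FROM products
    JOIN categories ON categories.id = products.category_id
    WHERE categories.expired = 1
    ORDER BY products.id
"""

subquery_rows = conn.execute(SUBQUERY).fetchall()
join_rows = conn.execute(JOIN).fetchall()

assert subquery_rows == join_rows == [(1, "old book"), (3, "old map")]
```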

Use all features PostgreSQL does offer

If you want to store structured data, then PostgreSQL is a safe default choice. It fits in most cases. Use all features PostgreSQL does offer. Don't constrain yourself to use only the portable SQL features. It's ok if your code does work only with PostgreSQL and no other database if this will solve your current needs. If there is a need to support other databases in the future, then handle this problem in the future, not today. PostgreSQL is great, and you waste time if you don't use its features.

Imagine there is a Meta-Programming-Language META (AFAIK this does not exist) and it is an official standard created by the ISO (like SQL). You can compile this Meta-Programming-Language to Java, Python, C, and other languages. But this Meta-Programming-Language would only support 70% of all features of the underlying programming languages. Would it make sense to say "My code must be portable, you must use META, you must not use implementation-specific stuff!"?. No, I think it would make no sense.

My conclusion: Use all features PostgreSQL has. Don't make your life more complicated than necessary and don't restrict yourself to use only portable SQL.

Great features PG has, which you might not know yet:

There is just one hint: Avoid storing binary data in PostgreSQL. An S3 service like minio is a better choice.

Where to not use PostgreSQL?

  • For embedded systems SQLite may fit better.
      • Prefer SQLite if there will only be one process accessing the database at a time. As soon as there are multiple users/connections, you need to consider going elsewhere.
  • TB-scale full-text search systems.
  • Scientific number crunching: hdf5
  • Caching: Redis fits better
  • Go with the flow: If you are wearing the admin hat (instead of the dev hat), and you should install (instead of developing) a product, then try the default DB (sometimes MySQL) first.

Source: PostgreSQL general mailing list: https://www.postgresql.org/message-id/5ded060e-866e-6c70-1754-349767234bbd%40thomas-guettler.de

Transactions do not nest

I love nested function calls and recursion. This way you can write easy-to-read code. For example, recursion in quicksort is great.

Nested transactions ... sounds great. But stop: What is ACID about? This is about:

  • Atomicity
  • Consistency
  • Isolation
  • Durability

Database transactions are atomic. If the transaction was successful, then it is Durable.

Imagine you have one outer-transaction and two inner transactions.

  1. Transaction OUTER starts
  2. Transaction INNER1 starts
  3. Transaction INNER1 commits
  4. Transaction INNER2 starts
  5. Transaction INNER2 raises an exception.

Is the result of INNER1 durable or not?

Conclusion: Transactions do not nest

Related: http://stackoverflow.com/questions/39719567/not-nesting-version-of-atomic-in-django

The "partial transaction" concept in PostgreSQL is called savepoints. https://www.postgresql.org/docs/devel/sql-savepoint.html They capture linear portions of a transaction's work. Your use of them may be able to express a hierarchical expression of updates that may be preserved or rolled back, but the concept in PostgreSQL is not itself hierarchical.
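The OUTER/INNER1/INNER2 scenario from above can be replayed with savepoints. This is a sketch against SQLite through Python's `sqlite3` module, chosen only because it needs no server; the `SAVEPOINT`, `RELEASE SAVEPOINT` and `ROLLBACK TO SAVEPOINT` statements are spelled the same in PostgreSQL. The table `log` is made up.

```python
import sqlite3

# isolation_level=None: no implicit BEGIN, we control the transaction by hand.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE log (msg TEXT NOT NULL)")

conn.execute("BEGIN")                          # OUTER starts
conn.execute("SAVEPOINT inner1")               # INNER1 starts
conn.execute("INSERT INTO log (msg) VALUES ('inner1')")
conn.execute("RELEASE SAVEPOINT inner1")       # INNER1 "commits", not durable yet

conn.execute("SAVEPOINT inner2")               # INNER2 starts
conn.execute("INSERT INTO log (msg) VALUES ('inner2')")
conn.execute("ROLLBACK TO SAVEPOINT inner2")   # INNER2 fails: undo only its work
conn.execute("RELEASE SAVEPOINT inner2")

# Inside OUTER the work of INNER1 is still visible.
inside_outer = [msg for (msg,) in conn.execute("SELECT msg FROM log")]
assert inside_outer == ["inner1"]

conn.execute("ROLLBACK")                       # OUTER fails as a whole

# The "committed" INNER1 is gone: only the outer transaction decides durability.
after_rollback = [msg for (msg,) in conn.execute("SELECT msg FROM log")]
assert after_rollback == []
```

Releasing a savepoint makes nothing durable. Only the final COMMIT or ROLLBACK of the one real transaction does.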

My customer wants to extend the data schema...

Imagine you created some kind of issue-tracking system. Up until now, you provide attributes like "subject", "description", "datetime created", "datetime last-modified", "tags", "related issues", "priority", ...

Now the customer wants to add some new attributes to issues. It would be quite easy for you to update the database schema and update the code.

Maybe you are lucky and you have 100 customers. Then you would prefer to spend your time improving the core product. You don't want to spend too much time on features which only one customer wants.

Or the customer wants to update the schema on their own.

What can you do now?

One solution is EAV: The Entity–attribute–value model
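A minimal EAV sketch for the issue tracker. The table `issue_attribute` and the attribute names are assumptions for illustration, and SQLite again stands in for PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE issue (id INTEGER PRIMARY KEY, subject TEXT NOT NULL);

    -- One row per custom attribute of one issue.
    CREATE TABLE issue_attribute (
        issue_id INTEGER NOT NULL REFERENCES issue(id),  -- entity
        name TEXT NOT NULL,                              -- attribute
        value TEXT NOT NULL,                             -- value
        PRIMARY KEY (issue_id, name)
    );

    INSERT INTO issue (id, subject) VALUES (1, 'Printer is on fire');
    INSERT INTO issue_attribute (issue_id, name, value)
        VALUES (1, 'cost_center', '4711'), (1, 'building', 'B');
""")


def custom_attributes(issue_id):
    """Return the customer-defined attributes of one issue as a dict."""
    rows = conn.execute(
        "SELECT name, value FROM issue_attribute "
        "WHERE issue_id = ? ORDER BY name",
        (issue_id,),
    )
    return dict(rows.fetchall())


assert custom_attributes(1) == {"building": "B", "cost_center": "4711"}
```

A new attribute is now a new row, not a schema change. The price is that every value is untyped text, so the database can no longer check it for you.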

Why I don't want to work with MongoDB

MongoDB is a cross-platform document-oriented database program. Classified as a NoSQL database program, MongoDB uses JSON-like documents with optional schemas. (Wikipedia)

One document in a collection can differ in its structure. For example, almost all documents in a collection have an integer value on the attribute "foo", but for unknown reasons, one document has a float instead of an integer. Grrr.

What does the solution look like?

return try {
    this.getLong(key)
} catch (e: ClassCastException) {
    if (this[key] is Double) this.getDouble(key).toLong() else null
}

No! I want a clear schema where all values in a column are of the same type.

Of course, my wish has a draw-back: If you want
