Programming Guidelines

My opinionated programming guidelines.

1. Introduction

2. Data structures

3. Dev

4. Remote APIs

5. Op

6. Networking

7. Monitoring

8. Communication with others

9. Epilog

1. Introduction

About this README

I was born in 1976. I started coding in BASIC and assembler when I was 13, and later moved on to Turbo Pascal. From 1996 to 2001 I studied computer science at HTW-Dresden (Germany). I learned Shell, Perl, Prolog, C, C++, Java, PHP, and finally Python.

Sometimes I see young and talented programmers wasting time. There are two ways to learn: make mistakes yourself, or learn from the mistakes other people have made.

This list summarises many of the mistakes I made in the past. I wrote it to help you avoid them.

It's my personal opinion and feeling. No facts, no single truth.

I need your feedback

If you have a general question, please start a new discussion.

If you think something is wrong or missing, feel free to open an issue or pull request.

Relaxed focus on your monitor

Do not look at the keyboard while you type. Have a relaxed focus on your monitor.

I type with ten fingers. Once you have learned it, it feels like flying. Your eyes can stay on the rubbish you type, and you don't need to move them down (to the keyboard) and up (to the monitor) several hundred times per day. This saves a lot of energy. A simple tool that helps you learn touch typing: tipp10

Measure your typing speed: 10fastfingers.com

Avoid switching between mouse and keyboard too much.

I like Lenovo keyboards with a TrackPoint. If you want more grip, read Desktop Tips "Keyboard".

Once I was fascinated by the copy+paste history of Emacs and PyCharm. But then I thought to myself: "I want more. I am hungry. I want a copy+paste history not only in one application, but I also want it for the whole desktop". The solution is very simple, but somehow only a few people use it. The solution is called a clipboard manager. I use CopyQ. I use ctrl+alt+v to open the list of last copy+paste texts. CopyQ supports regex searches in the history.

Avoid searching with your eyes

Avoid searching with your eyes. Search with the tools of your IDE. You should be able to use it "blind". You should be able to move the cursor to the matching position in your code without looking at your keyboard, without grabbing your mouse/touchpad/TrackPoint and without looking up/down on your screen.

Compare two files with a diff tool; otherwise you might get this ugly skeptical frown.

How often per day do you search for the mouse cursor on your screen? Support your eyes by increasing the cursor size. If you use Ubuntu, you can do it via Universal Access / Cursor Size.

Increase font size

During daily work, you often jump from one information snippet to the next information snippet.

When was the last time you read a text with more than 20 sentences?

I think from time to time you should do so. Slow down, focus on one text, and read slowly. It helps to increase the font-size. ctrl-+ is your friend.

KISS

Keep it simple and stupid. The most boring and most obvious solution is often the best. Although it sometimes takes months until you know which solution it is.

From the book "Site Reliability Engineering" (O'Reilly Media 2016) https://landing.google.com/sre/book/chapters/simplicity.html

Quote:

The Virtue of Boring

Unlike just about everything else in life, "boring" is a positive attribute when it comes to software! We don't want our programs to be spontaneous and interesting; we want them to stick to the script and predictably accomplish their business goals.

Example: Pure Functions are great. They are stateless, their output can be cached forever, they are easy to test.
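A minimal sketch of why pure functions are so pleasant to work with. The function name `gross_cents` and the VAT calculation are made up for illustration; the point is only that the result depends on the arguments and nothing else, so caching and testing need no setup.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def gross_cents(net_cents: int, vat_percent: int) -> int:
    """Pure: the result depends only on the arguments (no I/O, no globals)."""
    return net_cents + net_cents * vat_percent // 100


first = gross_cents(1000, 19)   # computed
second = gross_cents(1000, 19)  # served from the cache, same result
```

Because the function has no state, the cache never needs to be invalidated, and a test is a single comparison of input and output.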

Increase the obviousness

But it is not only about code. It is about the experience of all stakeholders: Users, salespeople, support hotline, developers,...

It is hard work to keep it simple.

One thing I love to do: "Increase the obviousness".

One tool to get there: Use a central wiki (without spaces), and define terms. Related text from me: Documentation in Intranets: My point of view

Avoid redundancy

See heading.

Premature optimization is the root of all evil.

The famous quote "premature optimization is the root of all evil" is true. You can read more about this in "When to optimize".

MVP

You should know what an MVP (minimum viable product) is. Building an MVP means bringing something usable to your customers and then listening to their feedback. Care for their needs, not for your vision of a super performant application.

Avoid i18n in an MVP. German is my mother tongue. If I develop an MVP for German users, then I won't do i18n. This can be done later, if needed.


2. Data structures

Introduction

"Bad programmers worry about the code. Good programmers worry about data structures and their relationships." -- Linus Torvalds (creator and developer of the Linux kernel and the version control system git)

Cache vs Database

There is a fundamental fact which you need to understand: The difference between a cache and a database.

Remember the basic Input-Process-Output pattern.

In a cache you store data which is output. That's handy since you can access the output without doing the processing again. But cache invalidation is hard. Maybe the input has changed, and the value in the cache is outdated? Who knows? If possible, avoid caching: without a cache you will never serve outdated data. You don't need to back up your cache data. You can create it again.

In a database you store data which is input. Usually it was entered by a human by hand, or generated by measuring some real-world data. You can use the data in the database to create a nice HTML page. It is important to back up your valuable database data, since you can't create it again. The generated output (HTML, JSON, ...) has no value.

Data which is input usually has value. Data which is output has only little value, since you can re-create it.

Relational Database

I know SQL is..... It is either obvious or incomprehensible. And, yes, it is boring.

A relational database is a rock-solid data storage. Use it.

When I studied computer science, I disliked SQL. I thought it was an outdated solution. I tried to store data in files in XML format, I used an in-memory Berkeley DB, I used an object-oriented database written in Python (ZODB), I used NoSQL .... And finally, I realized that boring SQL is the best solution for most cases.

I use PostgreSQL.

I don't like NoSQL, except for caching (simple key-value DB).

The PostgreSQL Documentation contains an introduction to SQL and is easy to read.

If you want to share small SQL snippets, you can use https://dbfiddle.uk/

Cardinality

It does not matter how you work with your data (struct in C, classes in OOP, tables in SQL, ...). Cardinality is very important. Using 0..* is often easier to implement than 0..1. The first can be handled by a simple loop. The second is often a nullable column/attribute. You need conditions (IFs) to handle nullable columns/attributes.

https://en.wikipedia.org/wiki/Cardinality_(data_modeling)

If this is new to you, I will give you two examples:

  • 1:N --> One invoice has several invoice positions. For example, you buy three books in one order, the invoice will have three invoice positions. This is a 1:N relationship. The invoice position is contained in exactly one invoice.
  • N:M --> If you look at tags, for example at the Question+Answer site StackOverflow: One question can be related to several tags/topics, and of course a topic can be set on several questions. For example, if you have a strange UnicodeError in Python, then you can set the tags "python" and "unicode" on your question. This is an N:M relationship. One well-known example of N:M is users and groups.
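The two relationships above can be sketched as tables. This is only an illustration: it uses Python's built-in sqlite3 module so that it runs without a server, and all table and column names (`invoice_position`, `question_tag`, ...) are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoice (id INTEGER PRIMARY KEY, customer TEXT NOT NULL);
    -- 1:N: every position points to exactly one invoice
    CREATE TABLE invoice_position (
        id INTEGER PRIMARY KEY,
        invoice_id INTEGER NOT NULL REFERENCES invoice(id),
        title TEXT NOT NULL
    );
    CREATE TABLE question (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    -- N:M: a link table holds one row per (question, tag) pair
    CREATE TABLE question_tag (
        question_id INTEGER NOT NULL REFERENCES question(id),
        tag_id INTEGER NOT NULL REFERENCES tag(id),
        PRIMARY KEY (question_id, tag_id)
    );
    INSERT INTO invoice VALUES (1, 'Alice');
    INSERT INTO invoice_position (invoice_id, title)
        VALUES (1, 'Book A'), (1, 'Book B'), (1, 'Book C');
    INSERT INTO question VALUES (1, 'Strange UnicodeError');
    INSERT INTO tag VALUES (1, 'python'), (2, 'unicode');
    INSERT INTO question_tag VALUES (1, 1), (1, 2);
""")

positions = [row[0] for row in conn.execute(
    "SELECT title FROM invoice_position WHERE invoice_id = ? ORDER BY id", (1,))]
tags = [row[0] for row in conn.execute(
    """SELECT name FROM tag
       WHERE id IN (SELECT tag_id FROM question_tag WHERE question_id = ?)
       ORDER BY name""", (1,))]

# 0..* needs no condition: zero rows simply means zero loop iterations.
for title in positions:
    print(title)
```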

Conditionless Data Structures

If you have no conditions in your data structures, then the coding for the input/output of your data will be much easier.

Avoid nullable Foreign Keys

Imagine you have a table "meeting" and a table "place". The table "meeting" has a ForeignKey to table "place". In the beginning, it might not be clear where the meeting will be. Most developers will make the ForeignKey optional (nullable). WAIT: This will create a condition in your data structure. There is a way easier solution: Create a place called "unknown". Use this sentinel value as the default. This data structure (without a nullable ForeignKey) makes implementing the GUI much easier.

In other words: If there is no NULL in your data, then there will be fewer NullPointerExceptions in your source code while processing the data :-)

Fewer conditions, fewer bugs.
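A minimal sketch of the "unknown" place, assuming sqlite3 as a stand-in database and invented column names (`subject`, `place_id`). The sentinel row gets id 1 and is used as the column default, so the foreign key can stay NOT NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE place (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    INSERT INTO place (id, name) VALUES (1, 'unknown');  -- sentinel row
    CREATE TABLE meeting (
        id INTEGER PRIMARY KEY,
        subject TEXT NOT NULL,
        place_id INTEGER NOT NULL DEFAULT 1 REFERENCES place(id)
    );
    INSERT INTO place (id, name) VALUES (2, 'Room 101');
    INSERT INTO meeting (subject) VALUES ('Kick-off');  -- place not known yet
    INSERT INTO meeting (subject, place_id) VALUES ('Review', 2);
""")

rows = conn.execute("""
    SELECT meeting.subject, place.name
    FROM meeting JOIN place ON place.id = meeting.place_id
    ORDER BY meeting.id
""").fetchall()

# No "if place is None" branch is needed when rendering the list.
for subject, place_name in rows:
    print(f"{subject}: {place_name}")
```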

Avoid nullable boolean columns

[True, False, Unknown] is not a nullable Boolean Column.

If you want to store data in a SQL database that has three states (True, False, Unknown), then you might think a nullable boolean column (here "my_column") is the right choice. But I think it is not. Do you think the SQL statement "select * from my_table where my_column = %s" works? No, it won't work, since "select * from my_table where my_column = NULL" will never return a single row. If you don't believe me, read: Effect of NULL in WHERE clauses (Wikipedia). If you like typing, you can work around this in your application, but I prefer straightforward solutions with only a few conditions.

If you want to store True, False, Unknown: Use text, integer, or a new table and a foreign key.
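The effect can be reproduced in a few lines. This sketch uses sqlite3 instead of PostgreSQL (so the placeholder is `?` rather than `%s`), but the NULL comparison behaves the same way. The second table, `my_table2` with a text column `state`, is a hypothetical name for the text-based alternative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, my_column BOOLEAN)")
conn.executemany("INSERT INTO my_table (my_column) VALUES (?)",
                 [(True,), (False,), (None,)])


def count_where_equal(value):
    """One generic query for all three states ... or so you might hope."""
    return conn.execute(
        "SELECT count(*) FROM my_table WHERE my_column = ?", (value,)
    ).fetchone()[0]


true_rows = count_where_equal(True)
false_rows = count_where_equal(False)
null_rows = count_where_equal(None)  # "= NULL" matches nothing

# Alternative: three explicit states in a NOT NULL text column.
conn.execute("""
    CREATE TABLE my_table2 (
        id INTEGER PRIMARY KEY,
        state TEXT NOT NULL DEFAULT 'unknown'
            CHECK (state IN ('true', 'false', 'unknown'))
    )
""")
conn.executemany("INSERT INTO my_table2 (state) VALUES (?)",
                 [("true",), ("false",), ("unknown",)])
unknown_rows = conn.execute(
    "SELECT count(*) FROM my_table2 WHERE state = ?", ("unknown",)
).fetchone()[0]
```

With the text column the same parameterised query works for all three states, without a special "IS NULL" branch.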

Avoid nullable characters columns

If you allow NULL in a character column, then you have two ways to express "empty":

  • NULL
  • empty string

Avoid it if possible. In most cases, you just need one variant of "empty". Simplest solution: do not allow NULL in columns holding character data.

If you think the character column should be allowed to be NULL (for example, you want a unique but optional identifier for rows), then consider a constraint: If the character string in the column is not NULL, then the string must not be empty. This way you ensure that there is only one variant of "empty".
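Such a constraint could look like the following sketch. It uses sqlite3 as a stand-in, and the table `item` with its column `external_id` is an invented example of a unique but optional identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE item (
        id INTEGER PRIMARY KEY,
        -- optional, but if present it must be unique and non-empty
        external_id TEXT UNIQUE
            CHECK (external_id IS NULL OR external_id <> '')
    )
""")
conn.execute("INSERT INTO item (external_id) VALUES (NULL)")
conn.execute("INSERT INTO item (external_id) VALUES (NULL)")  # several NULLs are fine
conn.execute("INSERT INTO item (external_id) VALUES ('A-1')")

try:
    conn.execute("INSERT INTO item (external_id) VALUES ('')")
    empty_string_rejected = False
except sqlite3.IntegrityError:
    # The CHECK constraint refuses the second variant of "empty".
    empty_string_rejected = True

row_count = conn.execute("SELECT count(*) FROM item").fetchone()[0]
```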

SQL: I prefer subqueries to joins

In most cases, I use an ORM to access data and don't write SQL by hand.

If I do write SQL by hand, then I often prefer SQL Subqueries to SQL Joins.

Have a look at this example:

SELECT id, name
FROM products
WHERE category_id IN
   (SELECT id
    FROM categories
    WHERE expired = True)

I can translate this to human language easily: Select all products, which belong to a category that has expired.
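For comparison, here is a sketch that runs the subquery next to an equivalent join on a tiny made-up data set. It uses sqlite3 and stores `expired` as 0/1 (an assumption, so that it also works on older SQLite versions); the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (
        id INTEGER PRIMARY KEY, name TEXT NOT NULL, expired INTEGER NOT NULL);
    CREATE TABLE products (
        id INTEGER PRIMARY KEY, name TEXT NOT NULL,
        category_id INTEGER NOT NULL REFERENCES categories(id));
    INSERT INTO categories VALUES (1, 'summer sale', 1), (2, 'books', 0);
    INSERT INTO products VALUES
        (1, 'sunscreen', 1), (2, 'novel', 2), (3, 'swimsuit', 1);
""")

with_subquery = conn.execute("""
    SELECT id, name
    FROM products
    WHERE category_id IN
       (SELECT id
        FROM categories
        WHERE expired = 1)
    ORDER BY id
""").fetchall()

with_join = conn.execute("""
    SELECT products.id, products.name
    FROM products
    JOIN categories ON categories.id = products.category_id
    WHERE categories.expired = 1
    ORDER BY products.id
""").fetchall()
```

Both queries return the same rows here. Which one reads better is a matter of taste; the subquery mirrors the sentence structure of the human-language description.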

Use all features PostgreSQL does offer

If you want to store structured data, then PostgreSQL is a safe default choice. It fits in most cases. Use all features PostgreSQL does offer. Don't constrain yourself to use only the portable SQL features. It's ok if your code does work only with PostgreSQL and no other database if this will solve your current needs. If there is a need to support other databases in the future, then handle this problem in the future, not today. PostgreSQL is great, and you waste time if you don't use its features.

Imagine there is a Meta-Programming-Language META (AFAIK this does not exist) and it is an official standard created by the ISO (like SQL). You can compile this Meta-Programming-Language to Java, Python, C, and other languages. But this Meta-Programming-Language would only support 70% of all features of the underlying programming languages. Would it make sense to say "My code must be portable, you must use META, you must not use implementation-specific stuff!"? No, I think it would make no sense.

My conclusion: Use all features PostgreSQL has. Don't make your life more complicated than necessary and don't restrict yourself to use only portable SQL.

Great features PG has, which you might not know yet:

There is just one hint: Avoid storing binary data in PostgreSQL. An S3 service like minio is a better choice.

Where to not use PostgreSQL?

  • For embedded systems, SQLite may fit better. Prefer SQLite if there will only be one process accessing the database at a time. As soon as there are multiple users/connections, you need to consider going elsewhere.
  • TB-scale full-text search systems.
  • Scientific number crunching: hdf5
  • Caching: Redis fits better
  • Go with the flow: If you are wearing the admin hat (instead of the dev hat), and you should install (instead of developing) a product, then try the default DB (sometimes MySQL) first.

Source: PostgreSQL general mailing list: https://www.postgresql.org/message-id/5ded060e-866e-6c70-1754-349767234bbd%40thomas-guettler.de

Transactions do not nest

I love nested function calls and recursion. This way you can write easy-to-read code. For example, recursion in quicksort is great.

Nested transactions ... sounds great. But stop: What is ACID about? This is about:

  • Atomicity
  • Consistency
  • Isolation
  • Durability

Database transactions are atomic. If the transaction was successful, then it is Durable.

Imagine you have one outer-transaction and two inner transactions.

  1. Transaction OUTER starts
  2. Transaction INNER1 starts
  3. Transaction INNER1 commits
  4. Transaction INNER2 starts
  5. Transaction INNER2 raises an exception.

Is the result of INNER1 durable or not?

Conclusion: Transactions do not nest

Related: http://stackoverflow.com/questions/39719567/not-nesting-version-of-atomic-in-django

The "partial transaction" concept in PostgreSQL is called savepoints. https://www.postgresql.org/docs/devel/sql-savepoint.html They capture linear portions of a transaction's work. Your use of them may be able to express a hierarchical expression of updates that may be preserved or rolled back, but the concept in PostgreSQL is not itself hierarchical.
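The OUTER/INNER1/INNER2 scenario can be played through with savepoints. This is a sketch using sqlite3 (whose SAVEPOINT, RELEASE SAVEPOINT and ROLLBACK TO SAVEPOINT statements also exist in PostgreSQL); the table `log` is made up. It shows that the "commit" of INNER1 is not durable: its work disappears when the outer transaction is rolled back.

```python
import sqlite3

# isolation_level=None: the module does not open transactions implicitly,
# so BEGIN / ROLLBACK are issued by hand.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE log (msg TEXT)")

conn.execute("BEGIN")                         # OUTER starts
conn.execute("SAVEPOINT inner1")              # INNER1 "starts"
conn.execute("INSERT INTO log VALUES ('inner1')")
conn.execute("RELEASE SAVEPOINT inner1")      # INNER1 "commits", not durable yet
conn.execute("SAVEPOINT inner2")              # INNER2 "starts"
conn.execute("INSERT INTO log VALUES ('inner2')")
conn.execute("ROLLBACK TO SAVEPOINT inner2")  # INNER2 fails, only its work is undone

rows_before = conn.execute("SELECT msg FROM log").fetchall()

conn.execute("ROLLBACK")                      # OUTER rolls back everything
rows_after = conn.execute("SELECT msg FROM log").fetchall()
```

Only the outermost COMMIT makes anything durable; the savepoints merely mark how far a rollback should go.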

My customer wants to extend the data schema...

Imagine you created some kind of issue-tracking system. Up until now, you provide attributes like "subject", "description", "datetime created", "datetime last-modified", "tags", "related issues", "priority", ...

Now the customer wants to add some new attributes to issues. It would be quite easy for you to update the database schema and update the code.

Maybe you are lucky and you have 100 customers. Then you would prefer to spend your time improving the core product. You don't want to spend too much time on features which only one customer wants.

Or the customer wants to update the schema on their own.

What can you do now?

One solution is EAV: The Entity–attribute–value model
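A minimal EAV sketch for the issue tracker, assuming sqlite3 and invented names (`issue_attribute`, `building`, `cost_center`). Customer-specific attributes become rows, so no schema change is needed for a new attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE issue (id INTEGER PRIMARY KEY, subject TEXT NOT NULL);
    -- entity = issue_id, attribute = name of the custom field, value = its content
    CREATE TABLE issue_attribute (
        issue_id INTEGER NOT NULL REFERENCES issue(id),
        attribute TEXT NOT NULL,
        value TEXT NOT NULL,
        PRIMARY KEY (issue_id, attribute)
    );
    INSERT INTO issue VALUES (1, 'Printer on fire');
    INSERT INTO issue_attribute VALUES
        (1, 'building', 'B2'),
        (1, 'cost_center', '4711');
""")

custom_attributes = dict(conn.execute(
    "SELECT attribute, value FROM issue_attribute WHERE issue_id = ?", (1,)))
```

The price of this flexibility: in this simple form every value is stored as text, so the database no longer checks the type of each attribute.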

Why I don't want to work with MongoDB

MongoDB is a cross-platform document-oriented database program. Classified as a NoSQL database program, MongoDB uses JSON-like documents with optional schemas. (Wikipedia)

One document in a collection can differ in its structure. For example, almost all documents in a collection have an integer value on the attribute "foo", but for unknown reasons, one document has a float instead of an integer. Grrr.

What does the solution look like?

return try {
    this.getLong(key)
  } catch (e: ClassCastException) {
    if (this[key] is Double) this.getDouble(key).toLong() else null
  }

No! I want a clear schema where all values in a column are of the same type.

Of course, my wish has a draw-back: If you want
