Image_Processing

Image_Processing

全面的图像处理实践指南与代码实现

Image_Processing项目提供了从基础到高级的图像处理实践指南。涵盖图像插值、几何变换、边缘检测等多个主题,通过丰富的代码示例帮助开发者掌握各种图像处理技术。该项目是入门图像处理的理想学习资源,适合希望系统学习图像处理的开发人员。

图像处理数字图像处理插值图像变换边缘检测Github开源项目

Road To Pixels

Welcome aboard. With the growing technologies out in the world, we have seen how important Image Processing has become. This repository provides a complete understanding of the practical implementation of all the concepts to be known for a developer to start their Image Processing journey.

Contents

  1. Basics with Images
  2. Successive Rotations
  3. Interpolations
  4. Interpolations-Inverse Mapping
  5. Basic Transformations
  6. Perspective Transofrmation
  7. Estimating the Transformation
  8. Log and Contrast Stretching
  9. Shading Correction
  10. Laplacian
  11. Laplacian+Gaussian
  12. Laplacian, Sobel, CannyEdge
  13. Sobel-X and Y
  14. Histogram Equalisation
  15. Normalize Histogram
  16. Image Temperature
  17. Box Filter
  18. GaussianFilter+Kernels
  19. Morphological Processing
  20. Morphological Text Processing
  21. Morphological Fingerprint Processing
  22. Morphological Outline
  23. Capture Video Frames
  24. Video background Subtraction
  25. VideoCapture_GoogleColab
  26. Contours-OpenCV
  27. Fitting Polygons
  28. Hough Lines
  29. Adaptive+Gaussian Thresholding
  30. OTSU Thresholding
  31. Grabcut
  32. Discrete Fourier Transformation
  33. OpenCV KMeans
  34. Object Movement Tracking
  35. Live Hand Gesture Recognition

Before we jump into the concepts, let us once have a look at the definition of Image Processing.

A Glance into Image Processing

Image processing is often viewed as arbitrarily manipulating an image to achieve an aesthetic standard or to support a preferred reality. However, image processing is more accurately defined as a means of translation between the human visual system and digital imaging devices. The human visual system does not perceive the world in the same manner as digital detectors, with display devices imposing additional noise and bandwidth restrictions. Salient differences between the human and digital detectors will be shown, along with some basic processing steps for achieving translation. Image processing must be approached in a manner consistent with the scientific method so that others may reproduce, and validate one's results. This includes recording and reporting processing actions and applying similar treatments to adequate control images.Src

There are two types of methods used for image processing namely, analog and digital image processing. Analog image processing can be used for hard copies like printouts and photographs. Various fundamentals of interpretation are used by the Image Analysts along with the visual techniques. Digital image processing deals with the manipulation of digital images through a digital computer. It is a subfield of signals and systems but focuses particularly on images. The three general phases that all types of data have to undergo while using digital techniques are

  • Pre-processing
  • Enhancement and Display
  • Information Extraction.

Fundamental Steps in DIP Fundamental Steps in Digital Image Processing - Rafael Gonzalez - 4th Edition Src

Important point to note while going through any concept is that the image is considered on a greyscale since color increases the complexity of the model. One may want to introduce an image processing tool using gray level images because of the format of gray-level images because the inherent complexity of gray-level images is lower than that of color images. In most cases. after presenting a gray-level image method, it can be extended to color images.

For getting deeper insights into any of the concepts, I suggest going through Digital Image Processing, Rafael C. Gonzalez • Richard E. Woods, 4th Edition

From here on I will be referring Digital Image Processing as DIP.

Disclaimer: I am not the original author of the images used. They have been taken from various Image Processing sites. I have mentioned all of the referenced sites in resources. Pardon if I missed any.

The following is the order I suggest to look into the concepts.

1. Basics with Images - Averaging Images

Image averaging is a DIP technique that is used to enhance the images which are corrupted with random noise. The arithmetic mean of the intensity values for each pixel position is computed for a set of images of the same view field. The basic formula behind it is.
Image Averaging over set of N images

2. Successive Rotations - Code

The images are rotated using the self-defined code for rotation instead of the OpenCV inbuilt function. When an image is rotated by 45 degrees for 8 times, it does not produce the same result as when it is rotated by 90 degrees for 4 times. This is because, when an image is rotated 45 degrees, during the rotation more pixels values for the new position of the pixels are to be calculated. And calculating these new pixel positions and their intensities uses interpolation which is an approximation method. So when an image is rotated by 90 degrees there is a smoother transition since fewer no of approximations are to be made for the new pixel positions and their intensities.

A clear example is shown below

Rotated by 45 deg - 8 timesRotated by 90 deg - 4 times

3. Interpolations - Code

Interpolation is used in tasks such as zooming, shrinking, rotating, and geometrically correcting digital images. It is the process of using known data to estimate values at unknown locations. So for giving the chance to estimate values, we will do some transformation, here it is a rotation by 45 degrees. The 3 interpolations we see here are:

Nearest NeighbourBilinearBicubic

Here you can see a slight variation between the 3 images. The smoothness gets better from left to right. Since Bicubic interpolation uses a higher-order equation it can capture features in-depth.

4. Interpolation-Inverse Mapping - Code

As mentioned here, there are two methods of mapping, the first, called forward mapping, scans through the source image pixel by pixel, and copies them to the appropriate place in the destination image. The second, reverse mapping, goes through the destination image pixel by pixel and samples the correct pixel from the source image. The most important feature of inverse mapping is that every pixel in the destination image gets set to something appropriate. In the forward mapping case, some pixels in the destination might not get painted and would have to be interpolated. We calculate the image deformation as a reverse mapping.

OriginalNearest Neighbour - Inverse Mapping

5. Basic Transformations - Code

We have seen the basic transformations like rotation and scaling. Now let's see one more basic transformation known as translation.

OriginalTranslation

6. Perspective Transformation - Code

The perspective transformation deals with the conversion of a 3D image into a 2D image for getting better insights about the required information. The 3D object co-ordinates are changed into the co-ordinates wrt world frame of reference and according to camera coordinate frame reference then continued by changing into Image Plane 2D coordinates and then to the pixel coordinates.

Distorted ImageOpenCV - Perspective Transf FunctionManual Correction

7. Est. Transformation - Code

This is just an example of using custom transformations for the required purpose. In the below example I have tried to extract the root part from the image.

OriginalTransformed

8. Log and Contrast Stretching - Code

One of the grey-level transformations is Logarithmic Transformation. It is defined as s = c*log(r+1) , where 's' and 'r' are the pixel values of the output and the input image respectively and 'c' is a constant.

OriginalLog-Transformed

Contrast Stretching is a simple image enhancement technique that attempts to improve the contrast in an image by stretching the range of intensity values it contains to span a desired range of values.

OriginalContrast Stretched

9. Shading Correction - Code

Shading Correction is used for correcting the parts of an image which are having some faults due to multiple reasons like, camera light obstruction. So correcting the image for the required purpose is essential. So in this example, we have used a faulty image of a chessboard and corrected the image. Gaussian Blur is used to correct the shading in the corner of the image.

OriginalCorrected Image

10. Laplacian - Code

A Laplacian filter is an edge detector which computes the second derivatives of an image, measuring the rate at which the first derivatives change. That determines if a change in adjacent pixel values is from an edge or continuous progression. A laplacian filter or kernel looks like this:
[0, 1, 0]
[1, -4, 1]
[0, 1, 0]

But a point to note is that Laplacian is very sensitive to noise. It even detects the edges for the noise in the image.

OriginalLaplacian Filter

11. Laplacian+Gaussian - Code

As you can see from the above example, the Laplacian kernel is very sensitive to noise. Hence we use the Gaussian Filter to first smoothen the image and remove the

编辑推荐精选

Trae

Trae

字节跳动发布的AI编程神器IDE

Trae是一种自适应的集成开发环境(IDE),通过自动化和多元协作改变开发流程。利用Trae,团队能够更快速、精确地编写和部署代码,从而提高编程效率和项目交付速度。Trae具备上下文感知和代码自动完成功能,是提升开发效率的理想工具。

热门AI工具生产力协作转型TraeAI IDE
问小白

问小白

全能AI智能助手,随时解答生活与工作的多样问题

问小白,由元石科技研发的AI智能助手,快速准确地解答各种生活和工作问题,包括但不限于搜索、规划和社交互动,帮助用户在日常生活中提高效率,轻松管理个人事务。

聊天机器人AI助手热门AI工具AI对话
Transly

Transly

实时语音翻译/同声传译工具

Transly是一个多场景的AI大语言模型驱动的同声传译、专业翻译助手,它拥有超精准的音频识别翻译能力,几乎零延迟的使用体验和支持多国语言可以让你带它走遍全球,无论你是留学生、商务人士、韩剧美剧爱好者,还是出国游玩、多国会议、跨国追星等等,都可以满足你所有需要同传的场景需求,线上线下通用,扫除语言障碍,让全世界的语言交流不再有国界。

讯飞智文

讯飞智文

一键生成PPT和Word,让学习生活更轻松

讯飞智文是一个利用 AI 技术的项目,能够帮助用户生成 PPT 以及各类文档。无论是商业领域的市场分析报告、年度目标制定,还是学生群体的职业生涯规划、实习避坑指南,亦或是活动策划、旅游攻略等内容,它都能提供支持,帮助用户精准表达,轻松呈现各种信息。

热门AI工具AI办公办公工具讯飞智文AI在线生成PPTAI撰写助手多语种文档生成AI自动配图
讯飞星火

讯飞星火

深度推理能力全新升级,全面对标OpenAI o1

科大讯飞的星火大模型,支持语言理解、知识问答和文本创作等多功能,适用于多种文件和业务场景,提升办公和日常生活的效率。讯飞星火是一个提供丰富智能服务的平台,涵盖科技资讯、图像创作、写作辅助、编程解答、科研文献解读等功能,能为不同需求的用户提供便捷高效的帮助,助力用户轻松获取信息、解决问题,满足多样化使用场景。

模型训练热门AI工具内容创作智能问答AI开发讯飞星火大模型多语种支持智慧生活
Spark-TTS

Spark-TTS

一种基于大语言模型的高效单流解耦语音令牌文本到语音合成模型

Spark-TTS 是一个基于 PyTorch 的开源文本到语音合成项目,由多个知名机构联合参与。该项目提供了高效的 LLM(大语言模型)驱动的语音合成方案,支持语音克隆和语音创建功能,可通过命令行界面(CLI)和 Web UI 两种方式使用。用户可以根据需求调整语音的性别、音高、速度等参数,生成高质量的语音。该项目适用于多种场景,如有声读物制作、智能语音助手开发等。

咔片PPT

咔片PPT

AI助力,做PPT更简单!

咔片是一款轻量化在线演示设计工具,借助 AI 技术,实现从内容生成到智能设计的一站式 PPT 制作服务。支持多种文档格式导入生成 PPT,提供海量模板、智能美化、素材替换等功能,适用于销售、教师、学生等各类人群,能高效制作出高品质 PPT,满足不同场景演示需求。

讯飞绘文

讯飞绘文

选题、配图、成文,一站式创作,让内容运营更高效

讯飞绘文,一个AI集成平台,支持写作、选题、配图、排版和发布。高效生成适用于各类媒体的定制内容,加速品牌传播,提升内容营销效果。

AI助手热门AI工具AI创作AI辅助写作讯飞绘文内容运营个性化文章多平台分发
材料星

材料星

专业的AI公文写作平台,公文写作神器

AI 材料星,专业的 AI 公文写作辅助平台,为体制内工作人员提供高效的公文写作解决方案。拥有海量公文文库、9 大核心 AI 功能,支持 30 + 文稿类型生成,助力快速完成领导讲话、工作总结、述职报告等材料,提升办公效率,是体制打工人的得力写作神器。

openai-agents-python

openai-agents-python

OpenAI Agents SDK,助力开发者便捷使用 OpenAI 相关功能。

openai-agents-python 是 OpenAI 推出的一款强大 Python SDK,它为开发者提供了与 OpenAI 模型交互的高效工具,支持工具调用、结果处理、追踪等功能,涵盖多种应用场景,如研究助手、财务研究等,能显著提升开发效率,让开发者更轻松地利用 OpenAI 的技术优势。

下拉加载更多