Building a Custom Chinese Glyph Library to Recognize Chinese Characters (Part 4: Practical Use)

Date: 04-11

Recognizing Common Chinese Characters

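1. The example so far used SimSun (宋体), Regular, Small Five (小五, 12x12). In practice SimSun mostly appears at Small Five, Five (五号), and Small Four (小四), so glyph templates are also needed for Five (14x14) and Small Four (16x16). Larger or smaller templates can be made as the use case requires, for example for headings and annotations, which use larger and smaller type. Recognizing non-SimSun fonts works the same way: build the templates you need. (In-game text is generally SimSun.)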
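The pixel sizes follow from the point sizes behind these font-size names. A minimal sketch of the arithmetic, assuming the usual point values (Small Five = 9 pt, Five = 10.5 pt, Small Four = 12 pt) and a 96 DPI display; `POINT_SIZES` and `point_to_pixels` are illustrative names, not part of the original code:

```python
# Assumed point sizes: xiao_wu = Small Five, wu_hao = Five, xiao_si = Small Four.
POINT_SIZES = {"xiao_wu": 9.0, "wu_hao": 10.5, "xiao_si": 12.0}

def point_to_pixels(points, dpi=96):
    """Convert a font size in points (1/72 inch) to pixels at the given DPI."""
    return int(round(points * dpi / 72.0))

# Glyph cell edge length in pixels for each named size.
PIXEL_SIZES = {name: point_to_pixels(pt) for name, pt in POINT_SIZES.items()}
```

At a different DPI setting the cell sizes change, so the templates would have to be regenerated for that setting.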

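2. Making the Five (五号) and Small Four (小四) glyph templates (images)

Following the method from the Chinese-character installment, produce the 14x14 image:

Then produce the 16x16 image: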

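3. Automated generation

Only a small change to the program is needed.

Scan the directory for PNG images and generate the glyph templates directly. The create_dic function from the earlier installment is reused as-is.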

#!/usr/bin/python
# -*- coding: gb18030 -*-

import os
from PIL import Image

#def find_color_rect(img, (x,y), width, height, color=(255,255,255)):
#def create_dic(img_N,N_width,ch_color,letter_pos, ch_txt_name):
# See the Chinese-character installment for these two functions.
    
    
def iterbrowse(path):
    for home, dirs, files in os.walk(path):
        for filename in files:
            yield os.path.join(home, filename)
            
if __name__ == "__main__":

    ch_dic_str = """#!/usr/bin/python
# -*- coding: gb18030 -*-

gCh_Dic= {
"""
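    for fullname in iterbrowse("./"):
        if fullname.lower().endswith(".png"):
            # The file name is expected to start with the two-digit glyph width.
            width = int(os.path.basename(fullname)[:2])
            img = Image.open(fullname)
            # Letter-like glyphs start at entry 3521 and are half as wide as a Chinese character.
            ch_dic_str += create_dic(img, width, (0,0,0), 3521, "3500.txt")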

    ch_dic_str += """}"""

    # Write the generated dictionary module; the file is closed automatically.
    with open("ch_dic.py", "w") as ch_dic_py:
        ch_dic_py.write(ch_dic_str)

Full code: https://pan.baidu.com/s/1P-6wud9L21y3uRTQWqpEAA

Extraction code: e7v1

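In practice, SimSun on Windows XP renders slightly differently from SimSun on Windows 7, so a separate set of 12x12, 14x14, and 16x16 glyph templates was made for XP as well.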
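The article does not say how the XP set is stored or selected. One possible approach, assuming each set is a dictionary shaped like gCh_Dic (bit pattern to character), is to merge them and flag any pattern that would map to two different characters. `merge_glyph_dicts` and the toy data are hypothetical:

```python
def merge_glyph_dicts(*dicts):
    """Merge glyph dictionaries, refusing keys that map to different characters."""
    merged = {}
    for d in dicts:
        for key, ch in d.items():
            if key in merged and merged[key] != ch:
                raise ValueError("pattern %d maps to both %r and %r"
                                 % (key, merged[key], ch))
            merged[key] = ch
    return merged

# Toy data standing in for the real Windows 7 and XP dictionaries.
win7_dic = {0b1111: "A", 0b1001: "B"}
xp_dic = {0b1111: "A", 0b1011: "B"}   # "B" rendered slightly differently on XP
combined = merge_glyph_dicts(win7_dic, xp_dic)
```

A merged dictionary lets the same lookup code run on both systems, at the cost of a larger table.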

4. When the font size is not known in advance, each size has to be tried in turn, which also slows the search down.

print_img_num(imgdic, (img.size[0], img.size[1]), (12, 12), (0,0,0))

print_img_num(imgdic, (img.size[0], img.size[1]), (14, 14), (0,0,0))

print_img_num(imgdic, (img.size[0], img.size[1]), (16, 16), (0,0,0))

In actual testing this turns out to be very slow.
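A rough estimate shows where the time goes. The sketch below counts pixel reads in the worst case, assuming the scan advances one pixel at a time when nothing matches and reads every pixel of each candidate cell. The function name and the 1920x1080 screenshot size are assumptions for illustration:

```python
def worst_case_pixel_reads(img_w, img_h, chw, chh):
    """Upper bound on pixel reads when no glyph matches anywhere.

    Every position where a chw x chh cell fits is tested, and each test
    reads chw * chh pixels.
    """
    positions = max(img_w - chw + 1, 0) * max(img_h - chh + 1, 0)
    return positions * chw * chh

sizes = [(12, 12), (14, 14), (16, 16)]
# Worst-case reads for each glyph size on an assumed 1920x1080 screenshot.
per_size = {s: worst_case_pixel_reads(1920, 1080, *s) for s in sizes}
total = sum(per_size.values())
```

Under these assumptions the 12x12 pass alone needs close to 300 million pixel reads, and the three passes together need over a billion, which is why limiting the search area matters.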

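5. So far the code is for my own use; it has significant limitations and much of it is unoptimized.

A real-world example follows.

a. The font size used in the target environment is known in advance.

b. The goal is to read map coordinates in a game.

c. In practice print_img_num also needs adjusting for efficiency: pass in the start coordinates and search range (startx,starty,findw,findy) so that only the relevant area is scanned.

d. In practice a screenshot can be captured and processed in real time: img = ImageGrab.grab()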
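Points c and d can be combined by working out which pixels a restricted scan can touch. This is a small sketch, assuming the scan tests cells whose top-left corner runs from (startx, starty) to (startx + findw, starty + findy); `scan_bbox` is a hypothetical helper, and the 1920x1080 image size is only an example:

```python
def scan_bbox(region, ch_size, img_size):
    """Return (left, top, right, bottom) covering every pixel the scan can read.

    region   -- (startx, starty, findw, findy) search start and range
    ch_size  -- (chw, chh) glyph cell size
    img_size -- (width, height) of the full image, used for clamping
    """
    startx, starty, findw, findy = region
    chw, chh = ch_size
    img_w, img_h = img_size
    # The last tested cell starts at (startx + findw, starty + findy),
    # so one extra glyph cell is needed on each axis.
    right = min(startx + findw + chw, img_w)
    bottom = min(starty + findy + chh, img_h)
    return (max(startx, 0), max(starty, 0), right, bottom)

bbox = scan_bbox((1130, 33, 160, 20), (12, 12), (1920, 1080))
```

The resulting tuple has the shape Pillow expects for `ImageGrab.grab(bbox=...)`, so only that area needs to be captured; coordinates passed to the scan would then be relative to the cropped image. If clamping shrinks the box, the search range should be reduced to match, otherwise the scan reads past the image edge.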

#!/usr/bin/python
# -*- coding: gb18030 -*-

import time
from PIL import Image
from ch_dic import *
   
def get_ch_dic(number, ch_width):
    # Full-width match: a Chinese character ch_width pixels wide.
    ch = gCh_Dic.get(number, "-1")
    if ch != "-1":
        return (ch, ch_width)
    # Half-width match: keep only the left half of the columns (letters, digits).
    temp = number >> (ch_width*ch_width//2)
    ch = gCh_Dic.get(temp, "-1")
    if ch != "-1":
        return (ch, ch_width//2)
    return "-1",-1
    

def find_color_rect_Ex(img, pos, chw, chh, color=(255,255,0)):
    # Pack the chw x chh cell at pos into one integer, column by column.
    x, y = pos
    ret_find = 0
    i = 0
    while i < chw:
        ret_find <<= chh
        j = 0
        while j < chh:
            #point_color = imgdic[(x+i, y+j)]
            point_color = img.getpixel((x+i, y+j))
            if point_color[0] == color[0] and point_color[1] == color[1] and point_color[2] == color[2]:
                ret_find |= 1 << (chh-1 - j)
            j += 1        
        i += 1
    return ret_find

def print_img_num_Ex(img, img_size, region, ch_size, color):
    # region is (startx, starty, findw, findy); ch_size is (chw, chh).
    startx, starty, findw, findy = region
    chw, chh = ch_size
    str_data = ""
    x, y = startx, starty
    while y <= starty + findy:
        x = startx
        sign = sign_space = 0
        while x <= startx + findw:
            st , skip_w = get_ch_dic(find_color_rect_Ex(img,(x,y),chw,chh,color), chw)
            if st != "-1":
                str_data += st
                x += skip_w
                if st != "_" and st != "—" and st != "-":
                    sign = 1
                sign_space = 1
                continue
            if sign_space:
                str_data += " "
            sign_space = 0
            x += 1
        if sign:
            y += chh
            str_data += "\n"
        else:
            y += 1
    print(str_data)
    

if __name__ == "__main__":

    time1 = time.time()
    img = Image.open("1.png")
    #img = ImageGrab.grab()    # live capture; needs "from PIL import ImageGrab"
    
    print_img_num_Ex(img, (img.size[0], img.size[1]), (1130,33,160,20),(12, 12), (247,182,80))
    print(time.time() - time1)
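The bit packing used by find_color_rect_Ex and the half-width fallback in get_ch_dic are easier to follow on a tiny cell. The sketch below is a standalone Python 3 illustration on a 4x4 text bitmap, not the article's code; `encode_cell`, `lookup`, and the toy glyphs are made up for the example:

```python
def encode_cell(rows, width, height):
    """Pack a width x height bitmap into one integer, column by column.

    rows is a list of strings; '#' marks a pixel in the glyph colour.
    The leftmost column ends up in the highest bits, top pixel first.
    """
    value = 0
    for i in range(width):            # columns, left to right
        value <<= height
        for j in range(height):       # rows, top to bottom
            if rows[j][i] == "#":
                value |= 1 << (height - 1 - j)
    return value

def lookup(value, width, height, table):
    """Try a full-width match first, then a half-width match."""
    if value in table:
        return table[value], width
    left_half = value >> (width * height // 2)   # drop the right-hand columns
    if left_half in table:
        return table[left_half], width // 2
    return None, 0

full = ["#..#",
        "#..#",
        "####",
        "#..#"]
half = ["##..",
        "#...",
        "#...",
        "##.."]
# Toy table: one full-width glyph and one half-width glyph (2 columns wide).
table = {encode_cell(full, 4, 4): "H",
         encode_cell([r[:2] for r in half], 2, 4): "["}

full_hit = lookup(encode_cell(full, 4, 4), 4, 4, table)
half_hit = lookup(encode_cell(half, 4, 4), 4, 4, table)
```

Because the left columns occupy the high bits, shifting right by half the bit count discards whatever sits in the right half of the cell, which is how a half-width letter is matched even when another character follows it.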