且构网


Find and replace text in a .docx file - Python

Updated: 2023-02-23 14:32:10

Word sometimes does strange things. Try deleting the text and retyping it in one go, that is, without editing it in the middle.

Your document is stored as an XML file (for .docx, usually word/document.xml after unzipping). Sometimes your text will not be stored in one piece: somewhere in the document there may be XXXCLIENT, and somewhere else NAMEXXX.

Something like this:

<w:t>XXXCLIENT</w:t>...<w:t>NAMEXXX</w:t>

This happens quite often because of language support: when Word decides that a piece of text belongs to a particular language, it splits the text there, and the split can land in the middle of what you typed, spreading it across multiple tags.
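To check whether a placeholder has been split, it can help to list the text fragments exactly as they are stored. The following is a minimal sketch using only the Python standard library; the function name `text_fragments` is made up for illustration, and it assumes the main body lives in word/document.xml as described above.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace URI that the "w:" prefix maps to in WordprocessingML.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def text_fragments(docx_file):
    """Return the text of every <w:t> element, in document order.

    `docx_file` may be a path or a binary file-like object
    (hypothetical helper, not part of any library).
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    return [node.text or "" for node in root.iter(f"{{{W_NS}}}t")]
```

If the list shows your placeholder spread over two or more entries, a plain search for the full string in the XML will not find it.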

The only problem with your solution is that you have to type everything in one go, which isn't very user-friendly.

I have created a JS library that uses mustache-like tags: {clientName} https://github.com/edi9999/docxgenjs

Overall it works the same way as your algorithm, but it won't break if the content is not in one piece (when you type {clientName} in Word, the text will usually be split into {, clientName, } in the document).
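For the Python side of the question, the same idea can be sketched as follows: join the text of all <w:t> nodes in a paragraph, search the joined string, then write the result back into the nodes the match touched. This is only an illustration of a split-tolerant replace, not the algorithm docxgenjs actually uses; `replace_in_paragraph` is a hypothetical name, and `paragraph` is assumed to be an ElementTree element for one <w:p>.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def replace_in_paragraph(paragraph, old, new):
    """Replace `old` with `new`, even if `old` spans several <w:t> nodes.

    Returns the number of replacements made (hypothetical helper).
    """
    if not old:
        raise ValueError("old must be a non-empty string")
    nodes = list(paragraph.iter(f"{{{W_NS}}}t"))
    count = 0
    search_from = 0
    while True:
        full = "".join(node.text or "" for node in nodes)
        start = full.find(old, search_from)
        if start == -1:
            return count
        end = start + len(old)
        offset = 0
        first = True
        for node in nodes:
            text = node.text or ""
            node_start, node_end = offset, offset + len(text)
            offset = node_end
            if node_end <= start or node_start >= end:
                continue  # this node lies entirely outside the match
            head = text[:max(start - node_start, 0)]  # text before the match
            tail = text[end - node_start:]            # text after the match
            # The replacement goes into the first node the match touches.
            node.text = head + (new if first else "") + tail
            first = False
        count += 1
        search_from = start + len(new)
```

The replacement text ends up in the first run, so it inherits that run's formatting, and the other runs are left in place with the matched part removed. One caveat I am assuming matters here: Word generally needs xml:space="preserve" on a <w:t> element to keep leading or trailing spaces, which this sketch does not add.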