Converting a Word document to text in VB.NET

Updated: 2023-02-08 09:20:59

I understand how to extract HTML from a Word document via the clipboard, but I'd like to remove all the tags except those for bold, italics, and highlighting. Any suggestions?


Jnana Sivananda

Do you mean text as in a standard text file? A standard text file does not support bold, italics, or highlighting on individual pieces of text; at best, one style can be applied to all the text in the file. RTF would, I suppose, support bold, italics, and highlighting for individual items in an RTF control (RichTextBox), but then the data would have to be saved as RTF and not as text.

Your question sounds strange. You want to convert a Word document to text, yet you mention HTML and the clipboard. Is the Word document an HTML document displayed in Word? Or are you somehow saving a Word document as HTML?

Maybe you could be more explicit about everything you are doing, since nobody knows that but you, and the explanation you provided isn't detailed enough for others to follow.
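
If the goal really is to reduce the clipboard HTML to text plus bold, italic, and highlight markup, the filtering step might look roughly like the sketch below. It is written in Python only to keep it short and self-contained; it is not a VB.NET implementation and not something posted in this thread. The names `FormattingFilter` and `keep_basic_formatting` are made up for illustration. It also assumes that Word marks highlighted text with an `mso-highlight` property in a `<span>` style attribute, which should be checked against the actual clipboard HTML, and it rewrites such spans as `<mark>`.

```python
from html import escape
from html.parser import HTMLParser

# Tags that are kept (without their attributes); all other tags are dropped
# while the text inside them is preserved.
KEEP = {"b", "strong", "i", "em", "mark"}
# Elements whose text content is dropped entirely.
SKIP_CONTENT = {"style", "script"}


class FormattingFilter(HTMLParser):
    """Hypothetical helper: keeps only bold, italic, and highlight markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.span_stack = []  # one bool per open <span>: True if it became <mark>
        self.skip_depth = 0   # > 0 while inside <style> or <script>

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_CONTENT:
            self.skip_depth += 1
        elif tag in KEEP:
            self.out.append(f"<{tag}>")
        elif tag == "span":
            style = dict(attrs).get("style") or ""
            # Assumption: Word writes highlights as "mso-highlight:<color>".
            is_highlight = "mso-highlight" in style.lower()
            self.span_stack.append(is_highlight)
            if is_highlight:
                self.out.append("<mark>")

    def handle_endtag(self, tag):
        if tag in SKIP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in KEEP:
            self.out.append(f"</{tag}>")
        elif tag == "span" and self.span_stack:
            if self.span_stack.pop():
                self.out.append("</mark>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            # Character references were decoded by the parser; re-escape them.
            self.out.append(escape(data, quote=False))


def keep_basic_formatting(markup):
    """Return markup with every tag removed except bold, italic, and highlight."""
    parser = FormattingFilter()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)
```

Using a parser rather than a single regular expression makes it easier to handle the attributes Word attaches to nearly every tag and to ignore comments and style blocks. The output is still markup, not plain text, so it would have to be saved as HTML (or converted to RTF) for the formatting to survive.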