Equations are not preserved in the HTML when converting a Word .docx file to HTML in C#

I have a .NET C# application that extracts data from a Word file and saves it in a database. To store all of the Word data as it is, I first convert the whole Word file to HTML and then check the HTML paragraphs one by one.
During the conversion from Word to HTML, the equations in the Word document are converted into images.
    // Requires: using System.IO; using Microsoft.Office.Interop.Word;
    Globals.ThisAddIn.Application.ActiveDocument.Select();
    Microsoft.Office.Interop.Word.Document doc = Globals.ThisAddIn.Application.ActiveDocument;

    string result = Path.GetTempPath();

    // Remember the original document's path so it can be reopened at the end
    string tmpFileName = Globals.ThisAddIn.Application.ActiveDocument.FullName;
    doc.SaveEncoding = Microsoft.Office.Core.MsoEncoding.msoEncodingUSASCII;
    if (File.Exists(result + "temp.html"))
    {
        File.Delete(result + "temp.html");
    }
    // Save the active document as filtered HTML; this is the step that turns equations into images
    doc.SaveAs(result + "temp.html", WdSaveFormat.wdFormatFilteredHTML);

    doc.Close(Microsoft.Office.Interop.Word.WdSaveOptions.wdDoNotSaveChanges);

    // Round-trip the file through HtmlAgilityPack to normalize the markup
    HtmlAgilityPack.HtmlDocument mangledHTML = new HtmlAgilityPack.HtmlDocument();
    mangledHTML.Load(result + "temp.html");

    if (File.Exists(result + "newtemp.html"))
    {
        File.Delete(result + "newtemp.html");
    }

    mangledHTML.Save(result + "newtemp.html");

    // Remove standalone CRLFs: protect blank lines (double CRLF) with a placeholder,
    // turn the remaining single CRLFs into spaces, then restore the protected breaks
    string badHTML = File.ReadAllText(result + "newtemp.html");
    badHTML = badHTML.Replace("\r\n\r\n", "ackThbbtt ");
    badHTML = badHTML.Replace("\r\n", " ");
    badHTML = badHTML.Replace("ackThbbtt ", "\r\n");
    // Replace U+FFFD, the Unicode replacement character (shown as '�' in the post), with a space
    badHTML = badHTML.Replace('\uFFFD', ' ');
    if (File.Exists(result + "finaltemp.html"))
    {
        File.Delete(result + "finaltemp.html");
    }
    File.WriteAllText(result + "finaltemp.html", badHTML);

    // Clean up the intermediate temp files; finaltemp.html holds the finished result
    File.Delete(result + "temp.html");
    File.Delete(result + "newtemp.html");

    // Reopen the original Word document
    Microsoft.Office.Interop.Word.Document originalDoc = Globals.ThisAddIn.Application.Documents.Open(tmpFileName);
Basically, I want to store each paragraph of the Word document separately in the database, together with all of its properties such as font size, font width, font name, and font style, so that I can show it in my application exactly as I wrote it in the Word document.
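
As a possible alternative to going through HTML, the paragraph and font data can be read straight from the .docx package, since a .docx file is a ZIP archive containing WordprocessingML. The following is only a minimal sketch under stated assumptions, not the approach used in the code above: the names `RunInfo` and `DocxParagraphReader` are hypothetical, the main part is assumed to sit at the conventional path `word/document.xml`, and only formatting applied directly to a run is read (values inherited from styles, and runs nested inside hyperlinks or similar wrappers, are ignored).

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical container for one run of text and its direct formatting.
public sealed class RunInfo
{
    public string Text { get; set; }
    public string FontName { get; set; }       // null when not set on the run itself
    public double? SizeInPoints { get; set; }  // null when not set on the run itself
    public bool Bold { get; set; }
    public bool Italic { get; set; }
}

public static class DocxParagraphReader
{
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Assumption: the main document part is at the conventional path word/document.xml.
    public static XDocument LoadDocumentXml(string docxPath)
    {
        using (ZipArchive zip = ZipFile.OpenRead(docxPath))
        {
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException("word/document.xml not found in " + docxPath);
            using (Stream part = entry.Open())
            {
                return XDocument.Load(part);
            }
        }
    }

    // Returns one list of runs per <w:p> paragraph, in document order.
    public static List<List<RunInfo>> ReadParagraphs(XDocument documentXml)
    {
        return documentXml.Descendants(W + "p")
            .Select(p => p.Elements(W + "r").Select(ReadRun).ToList())
            .ToList();
    }

    private static RunInfo ReadRun(XElement run)
    {
        XElement props = run.Element(W + "rPr");
        // <w:sz w:val="24"/> stores the font size in half-points (24 means 12 pt).
        string halfPoints = (string)props?.Element(W + "sz")?.Attribute(W + "val");
        return new RunInfo
        {
            Text = string.Concat(run.Elements(W + "t").Select(t => t.Value)),
            FontName = (string)props?.Element(W + "rFonts")?.Attribute(W + "ascii"),
            SizeInPoints = halfPoints == null
                ? (double?)null
                : double.Parse(halfPoints, CultureInfo.InvariantCulture) / 2.0,
            Bold = IsOn(props?.Element(W + "b")),
            Italic = IsOn(props?.Element(W + "i"))
        };
    }

    // A toggle such as <w:b/> is on unless its w:val is "0", "false" or "off".
    private static bool IsOn(XElement toggle)
    {
        if (toggle == null) return false;
        string val = (string)toggle.Attribute(W + "val");
        return val == null || (val != "0" && val != "false" && val != "off");
    }
}
```

Calling `DocxParagraphReader.ReadParagraphs(DocxParagraphReader.LoadDocumentXml(path))` would give one list of runs per paragraph, which could then be written to the database row by row. On .NET Framework, `ZipFile` needs a reference to `System.IO.Compression.FileSystem`.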

To reproduce the document faithfully, I need to convert it to HTML and then separate out the paragraphs so that I can store them in the database. The problem arises when a paragraph in the Word document contains equations:

    Globals.ThisAddIn.Application.ActiveDocument.Select();
    Microsoft.Office.Interop.Word.Document doc = Globals.ThisAddIn.Application.ActiveDocument;

    string result = Path.GetTempPath();

    string tmpFileName = Globals.ThisAddIn.Application.ActiveDocument.FullName;
    doc.SaveEncoding = Microsoft.Office.Core.MsoEncoding.msoEncodingUSASCII;
This code converts all the equations in my Word document into images, and once they are images I can't display the equations properly in my application.

So I tried to convert the equations to MathML, but I couldn't get it to work.
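
One commonly described route to MathML is to work from the equation's own markup instead of the HTML export. Word stores equations as Office Math Markup Language (OMML, `m:oMath` elements), and Office installations usually ship a stylesheet named `OMML2MML.XSL` that maps OMML to MathML. The sketch below is written under those assumptions: `EquationExtractor` and its method names are hypothetical, the location of `OMML2MML.XSL` differs between Office versions and so is passed in as a parameter, and the input XML is assumed to come from something like the interop `Range.WordOpenXML` property or the package's `word/document.xml` part.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Xsl;

public static class EquationExtractor
{
    private static readonly XNamespace M =
        "http://schemas.openxmlformats.org/officeDocument/2006/math";

    // Returns the OMML markup of every <m:oMath> element found in the given Word XML.
    public static List<string> ExtractOmml(string wordXml)
    {
        return XDocument.Parse(wordXml)
            .Descendants(M + "oMath")
            .Select(e => e.ToString(SaveOptions.DisableFormatting))
            .ToList();
    }

    // Applies an OMML-to-MathML stylesheet to one equation.
    // Assumption: xslPath points at the OMML2MML.XSL file from the local Office installation.
    public static string OmmlToMathMl(string omml, string xslPath)
    {
        var transform = new XslCompiledTransform();
        transform.Load(xslPath);

        using (XmlReader reader = XmlReader.Create(new StringReader(omml)))
        using (var output = new StringWriter())
        {
            using (XmlWriter writer = XmlWriter.Create(output, transform.OutputSettings))
            {
                transform.Transform(reader, writer);
            }
            return output.ToString();
        }
    }
}
```

Each string returned by `ExtractOmml` could be passed to `OmmlToMathMl` and the resulting MathML stored next to the paragraph it belongs to, so the application can render the equation instead of an image. Whether the stylesheet covers every equation construct in a given document would need to be checked.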

Sign in to post your reply or Sign up for a free account.

Similar topics

1
by: Ashutosh | last post by:
How can i convert Word file to txt file in ASP.NET using CSharp?
3
by: Chris Davoli | last post by:
I've got a requirement to build a page using MS WORD and then have the page show up on a web site. I know I can do a binary write and open up the WORD document in IE plugin. Don't really want to do...
1
by: ananth | last post by:
Hi All, Do anyone know how to get a word document in a rich text field and convert them into a HTML page programatically.The requirement is that there shouldnt be any third party tool...
1
by: firozfasilan | last post by:
I want the complete module for converting a word document to html file using visual basic 6 can you help me?
5
by: sangith | last post by:
Hi, How do I convert a word document into a text file. (For eg: If I give input as file1.doc, my Perl program should automatically convert it into file1.txt) Is there any Perl module which does...
0
DaBarrett
by: DaBarrett | last post by:
Hi, I tried to word repair 2007 document from the recycle bin on windows 2010 home edition. When I try to open it now i get the message; Word experienced an error trying to open this file. Try...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
1
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...
1
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...
0
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The...
1
muto222
php
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
0
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.