Saving a Web Page Into A Single File
Posted
by Onega Onega
on February 18th, 2003
// Each #import must sit on a single line; adjust the paths for your system.
#import "c:\program files\common files\system\ado\msado15.dll" no_namespace rename("EOF", "EndOfFile")
#import <cdosys.dll> no_namespace rename("EOF", "EndOfFile")
...
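void SaveWholePage(LPCTSTR page_url, LPCTSTR save_filename)
{
    CoInitialize(NULL);
    {   // inner scope: the smart pointers must be released before CoUninitialize()
        try
        {
            // The smart-pointer constructors and wrapper methods throw _com_error on failure.
            IMessagePtr iMsg(__uuidof(Message));
            IConfigurationPtr iConf(__uuidof(Configuration));
            iMsg->Configuration = iConf;
            // Fetch the page and the resources it references into one MIME body.
            iMsg->CreateMHTMLBody(
                page_url,
                cdoSuppressNone,
                "domain\\username",
                "password");
            // Save only after the body was built, so a failed fetch never writes an empty file.
            _StreamPtr pStream = iMsg->GetStream();
            pStream->SaveToFile(save_filename,
                adSaveCreateOverWrite);
        }
        catch (const _com_error& err)
        {
            // handle exception, e.g. inspect err.Error() and err.ErrorMessage()
        }
    }
    CoUninitialize();
}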
Sample Usage:
SaveWholePage("http://www.zaobao.com/gj/zg002_050203.html",
"test.mht");
Sample program is not provided, since you can create it on your own and copy & paste the above code snippet into your project.

Comments
How to save a picture (*.jpg or *.gif) in a web page as a single *.jpg or *.gif file?
Posted by toddson on 01/30/2008 07:27pm
Can we download a picture file in a web page and save it as a single picture file?

excellent
Posted by china007 on 09/25/2007 03:24am
Very helpful, thanks.

'IMessage' : undeclared identifier
Posted by haku_nin on 06/26/2007 09:44pm
Hi, I was trying this example, and also the corrected code that Keedo gave, and I got this error: 'IMessage' : undeclared identifier. Any thoughts on how to solve this problem? Thanks, Haku

How to save complete web page as .htm file instead of .mht file?
Posted by baskarchinnu on 02/28/2005 05:50am
Hi, I am working with the Win32 API SDK. My program hooks the IE window with my own BHO DLL (whenever IE is open, the DLL attaches to the IE browser), and through this DLL I want to save the complete web page without prompting with the Save As dialog box. I tried the following:
HRESULT hr = m_pWebBrowser->ExecWB(OLECMDID_SAVEAS, MSOCMDEXECOPT_DONTPROMPTUSER, NULL, NULL);
Even though I used the MSOCMDEXECOPT_DONTPROMPTUSER option, I still get the dialog box, and even if I try to save the web page through the Save As dialog box, the page is not saved. I get the error message "This Web page could not be saved". Can you please help me save the complete web page without the Save As dialog box? Any answer would be very helpful.

Function Fixed - Have working demo
Posted by keedo60 on 04/19/2004 01:08am
I now have this code working on my Windows 2000 Professional system, using MS Visual C++.
The errors generated by the compiler were caused by incorrect import statements.
The correct statements are shown below.
#import "c:\program files\common files\system\ado\msado15.dll" no_namespace raw_interfaces_only
#import "C:\WINNT\system32\cdosys.dll" no_namespace raw_interfaces_only
#include "C:\Program Files\Microsoft SDK\include\cdosysstr.h"
#include "C:\Program Files\Microsoft SDK\include\cdosyserr.h"
Make sure each import and include statement is on a single line. Also, note that these files may
have different paths on your system.
If you cannot find these files on your system, you will probably have to download the Platform
SDK from Microsoft.
You will need the Core SDK and the MDAC SDK component (this is the one with the CDO/ADO
support). After installation you will find the header files in the include directory of the SDK
folder. The cdosys.dll should be in your system folder.
The rest of the code is OK, except you must either uncomment the //return 1 statement, and
add the required semicolon, or redefine the function as void.
Here is that code again:
int CDBrowseView::SaveWholePage(CString szPageURL, CString szFileName)
{
    CoInitialize(NULL);
    BSTR bstr = szPageURL.AllocSysString();
    CString szUserName = "domain\\username";
    BSTR bstrUserName = szUserName.AllocSysString();
    CString szPass = "password"; // placeholder password
    BSTR bstrPass = szPass.AllocSysString();
    BSTR bstrFile = szFileName.AllocSysString();
    IMessage* pMsg = NULL;
    IConfiguration* pConfig = NULL;
    _Stream* pStm = NULL;
    HRESULT hr = CoCreateInstance(__uuidof(Message), NULL, CLSCTX_INPROC_SERVER,
        __uuidof(IMessage), (void**)&pMsg);
    if (SUCCEEDED(hr))
        hr = CoCreateInstance(__uuidof(Configuration), NULL, CLSCTX_INPROC_SERVER,
            __uuidof(IConfiguration), (void**)&pConfig);
    // With raw_interfaces_only these calls return an HRESULT rather than
    // throwing _com_error, so each result is checked.
    if (SUCCEEDED(hr))
        hr = pMsg->put_Configuration(pConfig);
    if (SUCCEEDED(hr))
        hr = pMsg->CreateMHTMLBody(bstr, cdoSuppressNone, bstrUserName, bstrPass);
    if (SUCCEEDED(hr))
        hr = pMsg->GetStream(&pStm);
    if (SUCCEEDED(hr))
        hr = pStm->SaveToFile(bstrFile, adSaveCreateOverWrite);
    // Release everything that was acquired, on success or failure.
    if (pStm) pStm->Release();
    if (pConfig) pConfig->Release();
    if (pMsg) pMsg->Release();
    SysFreeString(bstr);
    SysFreeString(bstrUserName);
    SysFreeString(bstrPass);
    SysFreeString(bstrFile);
    CoUninitialize();
    if (FAILED(hr))
    {
        AfxMessageBox("Exception");
        return 0;
    }
    return 1;
}
Just as an aside, in the event that you want to use a Save As dialog to call the function from, here is the code for that:
void CDBrowseView::OnHtmSave()
{
    static char szFilter1[] = "MHT File (*.mht)|*.mht|Email File (*.eml)|*.eml||";
    // FALSE gives you a file save dialog box, TRUE a file open
    CFileDialog m_FileDialog(FALSE, "mht", "*.mht",
        OFN_OVERWRITEPROMPT | OFN_HIDEREADONLY, szFilter1, NULL);
    if (m_FileDialog.DoModal() == IDOK)
    {
        // URL of web page to be saved
        CString szUrl = CDBrowseView::GetLocationURL();
        // filename and path returned from the Save As dialog
        CString szFile = m_FileDialog.GetPathName();
        // call save page function
        int r = SaveWholePage(szUrl, szFile);
        // handle return code from function
        if (r == 1)
            AfxMessageBox("File Saved ");
        else
            AfxMessageBox("Save Failed ");
    }
}

Is this a command console app?
Posted by GPBraaten on 10/03/2006 12:27pm
Hi Keedo60, The novice is here. I'm looking to run an app like yours from a Command Console prompt that feeds my user_id & password to our Intranet on a scheduled basis. I tried to compile your latest code into a VS 2003 C++ Console App, but received several error messages. Am I off base & missed something? Lines 11 & 12 bombed the most. Thanks.

re: Not able to overwrite file
Posted by keedo60 on 01/11/2005 07:51pm
I am guessing that it has to do with your Internet Explorer settings, and since the file has the same URL it is being written from the copy stored in the browser cache. Try going to the TOOLS/INTERNET OPTIONS menu of MSIE, click the GENERAL tab, then click SETTINGS in the Temporary Internet Files section, and check the box EVERY VISIT TO PAGE.
This should solve your problem. This will force the browser to update content on refresh or revisit.
CHtmlView, the IWebBrowser component, and the SDK APIs all borrow from MSIE functionality, and so are still limited and regulated by IE settings. Keep this in mind if you are developing apps for general distribution.

Not able to overwrite MHTML file once generated
Posted by prasannak on 01/11/2005 09:29am
Hello, I can compile the code, and it works properly the first time. If I try to send the same URL with different DATA/CONTENT in the HTML file, the changes are not reflected in the MHTML file and I always see the first created MHTML file. Please help me solve this. Thank you.

Not working for some web pages
Posted by keedo60 on 12/14/2004 09:00pm
PS. I also noticed that the page you referred to (http://deskshare.com/download.aspx/) is an ASP
(dynamically generated) page. Many dynamically generated pages (including PHP)
do not save correctly even from the full-blown IE using its built-in SAVE AS feature. For optimal
reliability -- and certainly for testing -- you should try to avoid using ASP and PHP URLs with
this function. Further, pages which make heavy use of Java applets, Flash objects, and/or
ActiveX controls will pose problems as well.

Not working for some web pages
Posted by keedo60 on 12/14/2004 08:48pm
Ajay.. you say this happens when navigating the pages in CHtmlView. You must realize that the
browser control used in CHtmlView is merely a mini-browser and does not fully support all the
same technology that the full-blown IE does.
Try navigating these websites directly in CHtmlView using the Navigate2() function and you
will probably generate the same script errors. This is because the IWebBrowser2 control only has
limited scripting support.
Here's the true test though... Once you save your file with this code... can you open the saved
page in your default browser normally? If so, then the code is working perfectly.

CreateMHTMLBody fails
Posted by keedo60 on 12/14/2004 08:36pm

Can anyone explain why CreateMHTMLBody call fails?
Posted by jschen on 11/21/2004 08:46pm
I am facing the same problem as phille and Ajay Sonawane. I need help, please!

CreateMHTMLBody call fails
Posted by phille on 06/22/2004 03:27am
Hello, I can compile the code, but the call to CreateMHTMLBody always fails and returns HRESULT 0x800401E4 ("Invalid syntax"). Does anyone have an idea what the cause could be?
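
A note on reading that error value: an HRESULT packs a failure bit, a facility and a code into 32 bits, and 0x800401E4 is the value that winerror.h names MK_E_SYNTAX, a moniker-parsing error. That makes the URL string passed to CreateMHTMLBody one thing worth checking (for example a missing http:// prefix), though this is an assumption and not a confirmed diagnosis for this thread. The sketch below, in portable C++, only shows the decomposition; HresultParts and decode_hresult are hypothetical names, not part of any Windows header.

```cpp
#include <cstdint>

// Fields of a 32-bit HRESULT in the standard COM layout:
// bit 31 = severity (1 means failure), bits 16-26 = facility, bits 0-15 = code.
// HresultParts and decode_hresult are illustrative names (assumptions).
struct HresultParts {
    bool failed;        // true when the severity bit is set
    unsigned facility;  // which subsystem defined the code
    unsigned code;      // the facility-specific error number
};

HresultParts decode_hresult(std::uint32_t hr) {
    HresultParts p;
    p.failed = (hr & 0x80000000u) != 0;
    p.facility = (hr >> 16) & 0x7FFu;
    p.code = hr & 0xFFFFu;
    return p;
}
```

For 0x800401E4 this gives failed = true, facility = 4 (FACILITY_ITF) and code = 0x01E4.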

Not working for some web pages..Why?
Posted by Ajay Sonawane on 04/27/2004 01:10am
Hello, I tried the same code for various web pages. But when I tried to navigate to those MHTML files in CHtmlView, it flashed some scripting errors and was unable to load and show some .gif images. Can you explain why? You can try the link below: http://deskshare.com/download.aspx/

need more info
Posted by keedo60 on 04/25/2004 07:34pm

Can't work, please help me!
Posted by Tomol on 04/21/2004 08:32am
I tried it, but the code doesn't work yet! I got a file that is only 268 bytes, such as (viewed in UltraEdit):
thread-index: AcQnkFqeXgt18I0GT6GWbyBE41XukA==
MIME-Version: 1.0
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
Content-Class: urn:content-classes:message
Importance: normal
Priority: normal
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106
SOS, please help me!
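
A note on that 268-byte file: the headers show a top-level Content-Type of text/plain and no body parts, which looks like an empty message. That would be consistent with CreateMHTMLBody having failed and the stream being saved anyway, although this is a guess from the headers alone. A saved page would normally start with a multipart Content-Type (multipart/related is what RFC 2557 describes for MHTML). The portable C++ sketch below checks a saved file's header block for that; top_level_content_type and looks_like_multipart are hypothetical helper names, and the file contents are assumed to be already read into a string.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>

// Returns the lower-cased value of the first Content-Type header in the
// header block, or an empty string if there is none. (Illustrative helper.)
std::string top_level_content_type(const std::string& file_text) {
    std::istringstream in(file_text);
    std::string line;
    const std::string key = "content-type:";
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF
        if (line.empty()) break;  // a blank line ends the header block
        std::string lower(line);
        std::transform(lower.begin(), lower.end(), lower.begin(),
                       [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
        if (lower.compare(0, key.size(), key) == 0) {
            std::string value = lower.substr(key.size());
            std::size_t start = value.find_first_not_of(" \t");
            return start == std::string::npos ? std::string() : value.substr(start);
        }
    }
    return std::string();
}

// True when the message declares a multipart body, as a saved page should.
bool looks_like_multipart(const std::string& file_text) {
    const std::string want = "multipart/";
    return top_level_content_type(file_text).compare(0, want.size(), want) == 0;
}
```

Running looks_like_multipart on the saved file's text right after SaveToFile is one cheap way to notice an empty save; for the headers quoted above it returns false.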

corrections...
Posted by Ajay Sonawane on 04/07/2004 05:10am

corrections...
Posted by Ajay Sonawane on 04/07/2004 04:12am

its urgent, plz
Posted by shueb on 04/27/2004 02:16am
The code works great when I save the .mht file in my current directory, but not on a remote system. On the remote system it shows only text. Please help me.

Error compiling sample code
Posted by NevF on 03/03/2004 03:36pm
Hi, This looks very good but I'm not getting very far trying to use it. When I compile the code I am getting various errors. First with:
#import "c:\program files\common files\system\ado\msado15.dll" _ no_namespace rename("EOF", "EndOfFile")
#import no_namespace rename("EOF", "EndOfFile")
which doesn't look at all right I get:
--------
--------------------Configuration: FMNLib - Win32 Debug--------------------
Compiling...
FMNWebPageGrab.cpp
D:\SAIG\FMNLib\FMNWebPageGrab.cpp(15) : warning C4185: ignoring unknown #import attribute '_'
d:\saig\bin6\debug\fmnlib\msado15.tlh(409) : warning C4146: unary minus operator applied to unsigned type, result still unsigned
d:\saig\bin6\debug\fmnlib\msado15.tlh(1317) : error C2629: unexpected 'short ('
d:\saig\bin6\debug\fmnlib\msado15.tlh(1317) : error C2238: unexpected token(s) preceding ';'
D:\SAIG\FMNLib\FMNWebPageGrab.cpp(16) : error C2146: syntax error : missing ';' before identifier 'rename'
D:\SAIG\FMNLib\FMNWebPageGrab.cpp(16) : error C2501: 'no_namespace' : missing storage-class or type specifiers
D:\SAIG\FMNLib\FMNWebPageGrab.cpp(16) : fatal error C1004: unexpected end of file found
Error executing cl.exe.
FMNWebPageGrab.obj - 5 error(s), 2 warning(s)
--------
Changing to:
#import "c:\program files\common files\system\ado\msado15.dll" no_namespace rename("EOF", "EndOfFile")
and removing the second:
#import no_namespace rename("EOF", "EndOfFile")
I get:
--------------------Configuration: FMNLib - Win32 Debug--------------------
Compiling...
FMNWebPageGrab.cpp
d:\saig\bin6\debug\fmnlib\msado15.tlh(407) : warning C4146: unary minus operator applied to unsigned type, result still unsigned
D:\SAIG\FMNLib\FMNWebPageGrab.cpp(24) : error C2065: 'IMessagePtr' : undeclared identifier
D:\SAIG\FMNLib\FMNWebPageGrab.cpp(24) : error C2146: syntax error : missing ';' before identifier 'iMsg'
D:\SAIG\FMNLib\FMNWebPageGrab.cpp(24) : error C2065: 'iMsg' : undeclared identifier
D:\SAIG\FMNLib\FMNWebPageGrab.cpp(24) : error C2065: 'Message' : undeclared identifier
D:\SAIG\FMNLib\FMNWebPageGrab.cpp(25) : error C2065: 'IConfigurationPtr' : undeclared identifier
D:\SAIG\FMNLib\FMNWebPageGrab.cpp(25) : error C2146: syntax error : missing ';' before identifier 'iConf'
D:\SAIG\FMNLib\FMNWebPageGrab.cpp(25) : error C2065: 'iConf' : undeclared identifier
D:\SAIG\FMNLib\FMNWebPageGrab.cpp(25) : error C2065: 'Configuration' : undeclared identifier
D:\SAIG\FMNLib\FMNWebPageGrab.cpp(26) : error C2227: left of '->Configuration' must point to class/struct/union
D:\SAIG\FMNLib\FMNWebPageGrab.cpp(29) : error C2227: left of '->CreateMHTMLBody' must point to class/struct/union
D:\SAIG\FMNLib\FMNWebPageGrab.cpp(31) : error C2065: 'cdoSuppressNone' : undeclared identifier
D:\SAIG\FMNLib\FMNWebPageGrab.cpp(39) : error C2227: left of '->GetStream' must point to class/struct/union
Error executing cl.exe.
FMNWebPageGrab.obj - 12 error(s), 1 warning(s)
--------------
I don't understand why ADO is being used in this code? If possible I'd certainly prefer not to have to use ADO. I'd really like to be able to use your code and hope you are able to help. Thanks in advance. Neville

ADO is used because...
Posted by eshipman on 03/09/2004 05:10pm
CreateMHTMLBody is a CDO call and CDO requires it.

Fuck You!! Nighthawk Do it yourself zzz
Posted by Legacy on 12/15/2003 12:00am
Originally posted by: GREAT KOREAN

URGENT!!! : where is the C# source code?
Posted by Legacy on 12/11/2003 12:00am
Originally posted by: NightHawk