Read a Local HTML File into a String Using VBA

Note: this page reproduces a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/18286598/

Time: 2020-08-29 12:15:51  Source: igfitidea

Read Local HTML File into String With VBA

html vba excel-vba io excel

Asked by ebrts

This feels like it should be simple. I have a .HTML file stored on my computer, and I'd like to read the entire file into a string. When I try the super straightforward approach:

Dim FileAsString As String

Open "C:\Myfile.HTML" For Input As #1
Input #1, FileAsString
Close #1

Debug.Print FileAsString

I don't get the whole file. I only get the first few lines (I know the immediate window cuts off, but that's not the issue. I'm definitely not getting the whole file into my string.) I also tried using an alternative method using the file system object, and got similar results, only this time with lots of weird characters and question marks thrown in. This makes me think it's probably some kind of encoding issue. (Although frankly, I don't fully understand what that means. I know there are different encoding formats and that this can cause issues with string parsing, but that's about it.)

So more generally, here's what I'd really like to know: How can I use VBA to open a file of any extension (that can be viewed in a text editor) and length (that doesn't exceed VBA's string limit), and be sure that whatever characters I would see in a basic text editor are what gets read into a string? (If that can't be (easily) done, I'd certainly appreciate being pointed towards a method that's likely to work with .html files.) Thanks so much for your help.

EDIT: Here's an example of what happens when I use the suggested method. Specifically:

Function FileToString(ByVal Path As String) As String

    Dim oFSO As Object
    Dim oFS As Object, sText As String

    Set oFSO = CreateObject("Scripting.FileSystemObject")
    Set oFS = oFSO.OpenTextFile(Path)

    Do Until oFS.AtEndOfStream
        sText = oFS.ReadAll()
    Loop
    FileToString = sText

    Set oFSO = Nothing
    Set oFS = Nothing

End Function

I'll show you both the beginning (via a message box) and the end (via the immediate window) because both are weird in different ways. In both cases I'll compare it to a screen capture of the HTML source displayed in Chrome:

Beginning: [two screenshots, not preserved]

End: [two screenshots, not preserved]

Accepted answer by ebrts

Okay, so I finally managed to figure this out. The VBA file system object reads files as ASCII by default, and I had saved mine as Unicode. Sometimes, as in my case, saving the file as ASCII instead can cause errors. You can get around this, however, by reading the file as binary and then converting it back to a string. The details are explained here: http://bytes.com/topic/asp-classic/answers/521362-write-xmlhttp-result-text-file
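
The binary round trip described above could look roughly like the sketch below. This is not the code from the linked page; the function name FileToStringBinary and the isUtf16 flag are hypothetical, and the sketch assumes the file is either ANSI or UTF-16LE (it does not decode UTF-8).

```vba
' Hypothetical helper: read the raw bytes of a file, then convert them to a String.
Function FileToStringBinary(ByVal filePath As String, _
                            Optional ByVal isUtf16 As Boolean = False) As String
    Dim fileNum As Integer
    Dim buffer() As Byte
    Dim result As String

    fileNum = FreeFile
    Open filePath For Binary Access Read As #fileNum
    If LOF(fileNum) = 0 Then
        Close #fileNum
        Exit Function                        ' empty file: return an empty string
    End If

    ReDim buffer(0 To LOF(fileNum) - 1)      ' one array element per byte in the file
    Get #fileNum, , buffer                   ' read the whole file in one call
    Close #fileNum

    If isUtf16 Then
        result = buffer                      ' bytes are copied as-is; a BOM, if present, stays as the first character
    Else
        result = StrConv(buffer, vbUnicode)  ' treat the bytes as ANSI text
    End If

    FileToStringBinary = result
End Function
```

A call such as `sText = FileToStringBinary("C:\Myfile.HTML", True)` would then read a file saved as "Unicode" in Notepad. Unlike `Input #`, which stops at the first comma or line break, the binary `Get` reads every byte.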

Answer by osknows

This is one method

Option Explicit

Sub test()

    Dim oFSO As Object
    Dim oFS As Object, sText As String

    Set oFSO = CreateObject("Scripting.FileSystemObject")
    Set oFS = oFSO.OpenTextFile("C:\Users\osknows\Desktop\import-store.csv")

    Do Until oFS.AtEndOfStream
        ' sText = oFS.ReadLine 'read line by line
        sText = oFS.ReadAll()
        Debug.Print sText
    Loop

End Sub

EDIT:

Try changing the following line to one of the following 3 lines and see if it makes any difference

http://msdn.microsoft.com/en-us/library/aa265347(v=vs.60).aspx

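' Documented signature: OpenTextFile(filename, iomode, create, format)
' format: 0 = ASCII, -1 = Unicode (UTF-16), -2 = system default
Set oFS = oFSO.OpenTextFile("C:\Users\osknows\Desktop\import-store.csv", 1, False, 0)
Set oFS = oFSO.OpenTextFile("C:\Users\osknows\Desktop\import-store.csv", 1, False, -1)
Set oFS = oFSO.OpenTextFile("C:\Users\osknows\Desktop\import-store.csv", 1, False, -2)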
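
For reference, the documented signature is OpenTextFile(filename, iomode, create, format), so the encoding is selected by the fourth argument, and its documented values are 0 (ASCII), -1 (Unicode) and -2 (system default). A minimal sketch that wraps this in a reusable function follows; the name ReadAllText and the constant declarations are hypothetical additions (late binding does not expose the library's named constants).

```vba
' Documented values for the iomode and format arguments of OpenTextFile
Const ForReading As Long = 1
Const TristateUseDefault As Long = -2   ' use the system default
Const TristateTrue As Long = -1         ' open as Unicode (UTF-16)
Const TristateFalse As Long = 0         ' open as ASCII (the default)

' Hypothetical helper: read a whole text file with an explicit format.
Function ReadAllText(ByVal filePath As String, _
                     Optional ByVal fmt As Long = TristateFalse) As String
    Dim fso As Object
    Dim ts As Object

    Set fso = CreateObject("Scripting.FileSystemObject")
    Set ts = fso.OpenTextFile(filePath, ForReading, False, fmt)

    ' ReadAll raises an error on an empty file, so check first
    If Not ts.AtEndOfStream Then ReadAllText = ts.ReadAll

    ts.Close
End Function
```

For a file saved as Unicode, `sText = ReadAllText("C:\Myfile.HTML", TristateTrue)` would be the call to try. A single ReadAll is enough, so no loop is needed.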

EDIT2:

Does this code work for you?

Function ExecuteWebRequest(ByVal url As String) As String

    Dim oXHTTP As Object

    Set oXHTTP = CreateObject("MSXML2.XMLHTTP")
    oXHTTP.Open "GET", url, False
    oXHTTP.send
    ExecuteWebRequest = oXHTTP.responseText
    Set oXHTTP = Nothing

End Function

Function OutputText(ByVal outputstring As String)
    Dim MyFile As String
    Dim fnum As Integer

    MyFile = ThisWorkbook.Path & "\temp.html"
    'set and open file for output
    fnum = FreeFile()
    Open MyFile For Output As #fnum
    'use Print when you want the string without quotation marks
    Print #fnum, outputstring
    Close #fnum
End Function

Sub test()

    Dim oFSO As Object
    Dim oFS As Object, sText As String
    Dim Uri As String, HTML As String

    Uri = "http://www.forrent.com/results.php?search_type=citystate&page_type_id=city&seed=859049165&main_field=12345&ssradius=-1&min_price=%240&max_price=No+Limit&sbeds=99&sbaths=99&search-submit=Submit"
    HTML = ExecuteWebRequest(Uri)
    OutputText HTML

    Set oFSO = CreateObject("Scripting.FileSystemObject")
    Set oFS = oFSO.OpenTextFile(ThisWorkbook.Path & "\temp.html")

    Do Until oFS.AtEndOfStream
        ' sText = oFS.ReadLine 'read line by line
        sText = oFS.ReadAll()
        Debug.Print sText
    Loop

End Sub

Answer by Alex L

A bit late to answer but I did this exact thing today (works perfectly):

Sub modify_local_html_file()
    Dim url As String
    Dim html As Object
    Dim fill_a As Object

    url = "C:\Myfile.HTML"

    Dim oFSO As Object
    Dim oFS As Object, sText As String

    Set oFSO = CreateObject("Scripting.FileSystemObject")
    Set oFS = oFSO.OpenTextFile(url)

    Do Until oFS.AtEndOfStream
        sText = oFS.ReadAll()
        Debug.Print sText
    Loop

    Set html = CreateObject("htmlfile")
    html.body.innerHTML = sText

    oFS.Close
    Set oFS = Nothing

    '# grab some element #'
    Set fill_a = html.getElementById("val_a")

    MsgBox fill_a.innerText

    '# change its inner text #'
    fill_a.innerText = "20%"

    MsgBox fill_a.innerText

    '# open file this time to write to #'
    Set oFS = oFSO.OpenTextFile(url, 2)

    '# write the modified html #'
    oFS.write html.body.innerHTML
    oFS.Close

    Set oFSO = Nothing
    Set oFS = Nothing

End Sub
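
If the HTML file is saved as UTF-8, opening it with OpenTextFile and its default format may garble non-ASCII characters. One possible alternative, sketched here under the assumption that ADO is available on the machine, is to read the file through ADODB.Stream with an explicit charset. The function name ReadTextWithCharset is hypothetical and is not part of the answer above.

```vba
' Hypothetical helper: read a text file using an explicit character set.
Function ReadTextWithCharset(ByVal filePath As String, _
                             Optional ByVal charsetName As String = "utf-8") As String
    Const adTypeText As Long = 2        ' documented StreamTypeEnum value for text data
    Dim stm As Object

    Set stm = CreateObject("ADODB.Stream")
    stm.Type = adTypeText
    stm.Charset = charsetName           ' e.g. "utf-8", "unicode", "windows-1252"
    stm.Open
    stm.LoadFromFile filePath
    ReadTextWithCharset = stm.ReadText  ' with no argument, reads to the end of the stream
    stm.Close
End Function
```

The result could then replace the ReadAll step, for example `sText = ReadTextWithCharset("C:\Myfile.HTML")`, before assigning it to `html.body.innerHTML`.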