guides:com:start

Differences

This shows you the differences between two versions of the page.

guides:com:start [2022-02-27 15:05] – ↷ Page moved and renamed from guides:com_object_reference to guides:com:start geek
guides:com:start [2025-03-26 12:36] (current) – [IDispatch Objects] Demote header level geek
Line 1: Line 1:
-====== COM Object Reference ======+====== COM APIs ======
  
-[[ahk:com:htmlfile]] - Represents an HTML document. Can be used to read, write, interact with HTML.+> Via [[https://learn.microsoft.com/en-us/windows/win32/com/the-component-object-model|Microsoft]] 
 +
 +> The Microsoft Component Object Model (COM) is a platform-independent, distributed, object-oriented system for creating binary software components that can interact. COM is the foundation technology for Microsoft's OLE (compound documents), ActiveX (Internet-enabled components), as well as others. 
 +
 +> To understand COM (and therefore all COM-based technologies), it is crucial to understand that it is not an object-oriented language but a standard. Nor does COM specify how an application should be structured; language, structure, and implementation details are left to the application developer. Rather, COM specifies an object model and programming requirements that enable COM objects (also called COM components, or sometimes simply objects) to interact with other objects. These objects can be within a single process, in other processes, and can even be on remote computers. They can be written in different languages, and they may be structurally quite dissimilar, which is why COM is referred to as a binary standard; a standard that applies after a program has been translated to binary machine code.
  
-[[ahk:com:internetexplorer.application]] - Explore Websites.+COM (the Component Object Model) is an object-oriented standard for designing objects (collections of function code called methods, and variable-like data called properties) to be accessible and usable across different programming languages. COM was designed by Microsoft and is reliant on the Windows API to function. Following the COM design, an application registers a "Component" with Windows that third-party applications can call on to retrieve an instance of a COM Object. When interacting with that object, the Windows API will automatically handle the communication between your application and the third-party software, allowing you to treat the object like a native object and not worry about inter-process communication (IPC).
  
-[[ahk:com:msxml2]] - DOMDocument 6.0 XML parser (v6.0 requires XP SP3 or newer; see here for compatibility with older XP).+A COM component registered by a software package provides one or more globally unique identifiers (GUIDs), which are 128-bit numbers formatted in hex like ''123e4567-e89b-12d3-a456-426614174000''. A GUID identifying an entire component is called a CLSID (Class Identifier). A GUID identifying a single interface offered by the component is called an IID (Interface Identifier). The CLSID is registered with Windows, sometimes along with a human-readable program ID such as ''InternetExplorer.Application'', ''WinHttp.WinHttpRequest.5.1'', ''WScript.Shell'', or ''AutoCAD.Application''. Windows keeps the list of CLSIDs in the Registry, so that they can be looked up later. That CLSID, or when available the human-readable program ID, is specified by your own application when asking Windows for an instance object representing one of those registered components. Upon this request, Windows will handle loading any third-party code transparently, allowing you to use the object without worrying about implementation details like DLL management.
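To make the "128-bit number formatted in hex" description concrete, here is a minimal sketch in portable C. ''MockGuid'' and its helper functions are hypothetical names invented for this illustration; the field split (32 + 16 + 16 + 64 bits) mirrors the layout of the Windows ''GUID'' struct, but nothing here uses the real Windows headers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the Windows GUID struct: 128 bits in four fields. */
typedef struct MockGuid {
    uint32_t data1;
    uint16_t data2;
    uint16_t data3;
    uint8_t  data4[8];
} MockGuid;

static_assert(sizeof(MockGuid) == 16, "a GUID is 128 bits wide");

/* Format a GUID as 8-4-4-4-12 lowercase hex digits.
 * out must have room for 36 characters plus the terminating NUL. */
static void mock_guid_to_string(const MockGuid *g, char out[37]) {
    snprintf(out, 37, "%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x",
             (unsigned)g->data1, (unsigned)g->data2, (unsigned)g->data3,
             (unsigned)g->data4[0], (unsigned)g->data4[1],
             (unsigned)g->data4[2], (unsigned)g->data4[3],
             (unsigned)g->data4[4], (unsigned)g->data4[5],
             (unsigned)g->data4[6], (unsigned)g->data4[7]);
}

/* Two GUIDs name the same class or interface only if all 128 bits match. */
static int mock_guid_equal(const MockGuid *a, const MockGuid *b) {
    return a->data1 == b->data1 && a->data2 == b->data2 &&
           a->data3 == b->data3 && memcmp(a->data4, b->data4, 8) == 0;
}
```

Whether a given GUID acts as a CLSID or an IID is purely a matter of how it is used; the bits themselves carry no such tag.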
  
-[[ahk:com:scriptcontrol]] - Dynamically execute VBScript or JScript.+When your application retrieves an object from a COM component, that object can either connect to a new instance of the third-party code, or connect to an existing instance of the third-party code. For example, a COM Object registered by Microsoft Office could be requested to allow your application to manipulate documents in the background, independently of any running Office applications, or it could connect to a running application to allow manipulation of the document in a visible application window.
  
-[[ahk:com:scripting.dictionary]] - Object that stores data key/item pairs.+Because of its flexibility, COM forms the basis of OLE (Object Linking and Embedding), ActiveX, Active Scripting, and even DirectX.
  
-[[ahk:com:scripting.filesystemobject]] - Access Files & Folders+OLE: Under OLE, COM facilitates the embedding of objects representing documents of one type within another type of document. For example, embedding a piece of media in a slideshow, or embedding a spreadsheet in a text document.
  
-[[ahk:com:shell.application]] - Access Explorer & IE Windows/Tabs; Open & Manipulate Windows.+ActiveX: Under ActiveX, COM facilitates the embedding of software within other software. For example, embedding a Java applet or a Flash player within a web browser, or embedding a web browser within another desktop application.
  
-[[ahk:com:shell.explorer]] - Embed an Explorer/Browser Control in Gui (Internet Explorer Trident Browser)+[[https://en.wikipedia.org/wiki/Active_Scripting|Active Scripting]]: Originally known as ActiveX Scripting, Active Scripting is a framework for developing scripting languages to take advantage of powerful high-level COM interfaces. In the early days of the web, this would have allowed scripting languages other than JavaScript to be embedded into web pages. Today, it powers Microsoft Office's VBScript macros. In many respects, AutoHotkey can be considered an Active Scripting language. Its objects build off the COM base, allowing them to be passed seamlessly back and forth between AutoHotkey and third-party COM components.
  
-[[ahk:com:sapi.spvoice]] - Speech API: SpVoice - Text-to-Speech (TTS)+In short, COM was Microsoft's primary solution for communication between software packages in the years before they leaned into their .NET Common Language Runtime. A lot of the Windows API and third-party software still supports COM interfaces, and utilizing those interfaces will allow you to do amazing things. AutoHotkey, especially AutoHotkey v2, has a variety of tools for interacting with those interfaces, if only you take the time to learn how to use them.
  
-[[ahk:com:vbscript.regexp]] - VBS Regular Expressions (including global match)+===== Anatomy of a COM Object =====
  
-Windows Media Player - Play Media Files; Embed WMPlayer Control in GUI.+A COM Object follows the %%C++%% [[https://en.wikipedia.org/wiki/Application_binary_interface|ABI]] for objects. COM objects are composed of structured data and what is known as a virtual method table (vtable).
  
-[[ahk:com:winhttp.winhttprequest]] - Provides simple HTTP client functionality, allowing much more control than UrlDownloadToFile.+The virtual method table is an array of pointers to %%__stdcall%% functions. They are arranged in the order they are declared in headers. Each method is implemented by a regular function where the first parameter is "This", a pointer to the structured data of the object. All COM objects derive from the IUnknown interface, so a basic IUnknown-compatible COM object's vtable would look like this:
  
-[[ahk:com:winmgmt]] - Get System Information; Manage Windows Services+<code c> 
 +// Interface Identifier (IID) {00000000-0000-0000-C000-000000000046} 
 +typedef struct IUnknownVtbl { 
 + HRESULT (__stdcall *QueryInterface)(IUnknown *This, ...); // From IUnknown 
 + ULONG (__stdcall *AddRef)(IUnknown *This); // From IUnknown 
 + ULONG (__stdcall *Release)(IUnknown *This); // From IUnknown 
 +} IUnknownVtbl; 
 +</code>
  
-[[ahk:com:wscript.shell]] - Various Administration Tasks (many native AHK tasks)+And an //object// with this interface would look like this:
  
 +<code c>
 +typedef struct IUnknown {
 + IUnknownVtbl* vtbl;
 + ... // any data fields go here
 +} IUnknown;
 +</code>
  
-**MS Office Applications**+So when you have a (pointer to a) COM object ''pObject'' of type IUnknown, you could call its method "QueryInterface" by: 
 +  - Retrieving the vtable: ''pObjectVtbl := NumGet(pObject, 0, "Ptr")'' 
 +  - Retrieving the function reference at index ''0'' (byte offset ''0 * A_PtrSize''): ''pObjectQueryInterface := NumGet(pObjectVtbl, 0 * A_PtrSize, "Ptr")'' 
 +  - Calling the function passing the object as the first parameter: ''DllCall(pObjectQueryInterface, "Ptr", pObject, ...)'' 
 +(or in AHKv2, by using ComCall which performs all those steps for you)
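The three steps above can also be sketched in portable C against a mock object. Everything named ''Mock...'' or ''mock_...'', along with ''read_slot'' and the two ''call_..._by_slot'' helpers, is hypothetical and exists only for this illustration; real COM methods additionally use the %%__stdcall%% convention and the real Windows types. The sketch assumes that all function pointers have the same size as a data pointer, which holds on the platforms COM targets.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct MockObject MockObject;

/* Mock vtable with the same three slots as IUnknown. */
typedef struct MockVtbl {
    long          (*QueryInterface)(MockObject *This, const void *riid, void **ppv);
    unsigned long (*AddRef)(MockObject *This);
    unsigned long (*Release)(MockObject *This);
} MockVtbl;

struct MockObject {
    const MockVtbl *vtbl; /* the vtable pointer is always the first field */
    unsigned long   refs; /* the object's own data follows it */
};

static_assert(offsetof(MockObject, vtbl) == 0, "vtable pointer must come first");

static long mock_query_interface(MockObject *This, const void *riid, void **ppv) {
    (void)riid;   /* this mock answers every interface request with itself */
    *ppv = This;
    This->refs++;
    return 0;     /* plays the role of S_OK */
}
static unsigned long mock_add_ref(MockObject *This) { return ++This->refs; }
static unsigned long mock_release(MockObject *This) { return --This->refs; }

static const MockVtbl mock_vtbl = { mock_query_interface, mock_add_ref, mock_release };

typedef void (*RawFn)(void);

/* Steps 1 and 2: read the vtable pointer stored at offset 0 of the object,
 * then read the function pointer stored at byte offset index * pointer size. */
static RawFn read_slot(const void *pObject, size_t index) {
    const char *vtable;
    RawFn fn;
    memcpy(&vtable, pObject, sizeof vtable);
    memcpy(&fn, vtable + index * sizeof fn, sizeof fn);
    return fn;
}

/* Step 3: call the function, passing the object itself as the first argument. */
static long call_query_interface_by_slot(MockObject *obj, void **ppv) {
    typedef long (*QueryInterfaceFn)(MockObject *, const void *, void **);
    return ((QueryInterfaceFn)read_slot(obj, 0))(obj, NULL, ppv);
}

static unsigned long call_add_ref_by_slot(MockObject *obj) {
    typedef unsigned long (*AddRefFn)(MockObject *);
    return ((AddRefFn)read_slot(obj, 1))(obj);
}
```

The two ''memcpy'' reads in ''read_slot'' correspond to the two ''NumGet'' calls, and the cast-and-call corresponds to the ''DllCall''.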
  
-[[ahk:com:excel.application]]+IUnknown is the most basic of COM Object interfaces, but to perform useful work it is typically necessary to work with objects that //extend// IUnknown, such as IDispatch. With an interface that extends another, the vtable will start with the functions from the original interface and then continue into the new extended functions. For IDispatch, this means its vtable would look like this:
  
-[[ahk:com:outlook.application]]+<code c> 
 +// Interface Identifier (IID) {00020400-0000-0000-C000-000000000046} 
 +typedef struct IDispatchVtbl { 
 + // From IUnknown 
 + HRESULT (__stdcall *QueryInterface)(IDispatch *This, ...); 
 + ULONG (__stdcall *AddRef)(IDispatch *This); 
 + ULONG (__stdcall *Release)(IDispatch *This);
  
-[[ahk:com:powerpoint.application]]+ // From IDispatch 
 + HRESULT (__stdcall *GetTypeInfoCount)(IDispatch *This, ...); 
 + HRESULT (__stdcall *GetTypeInfo)(IDispatch *This, ...); 
 + HRESULT (__stdcall *GetIDsOfNames)(IDispatch *This, ...); 
 + HRESULT (__stdcall *Invoke)(IDispatch *This, ...); 
 +} IDispatchVtbl; 
 +</code>
  
-[[ahk:com:word.application]]+Therefore, the indexes of the IDispatch methods in the vtable start at ''3'', not ''0''. This is very important to keep in mind when looking for indexes from headers posted online. For example, it is often helpful to perform Google searches such as ''IDispatchVtbl filetype:h'' to find header files [[https://github.com/tpn/winsdk-10/blob/master/Include/10.0.16299.0/um/OAIdl.h#L2242|like this one]]. Instead of showing that it begins with the IUnknown functions, it just has the text ''BEGIN_INTERFACE'' which, while likely easier to write and manage, is not very useful to us, the readers.
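One way to double-check slot numbers is to let a C compiler count them. The sketch below uses a mock vtable with simplified parameter types (''MockDispatchVtbl'' and ''VTBL_SLOT'' are hypothetical names, and the real header uses Windows types and %%__stdcall%%); only the order of the members matters here. It assumes every slot is exactly one function pointer wide with no padding in between.

```c
#include <assert.h>
#include <stddef.h>

typedef struct MockDispatch MockDispatch;

typedef struct MockDispatchVtbl {
    /* Slots 0-2: inherited from IUnknown */
    long          (*QueryInterface)(MockDispatch *This, const void *riid, void **ppv);
    unsigned long (*AddRef)(MockDispatch *This);
    unsigned long (*Release)(MockDispatch *This);

    /* Slots 3-6: added by IDispatch (parameter types simplified) */
    long (*GetTypeInfoCount)(MockDispatch *This, unsigned *pctinfo);
    long (*GetTypeInfo)(MockDispatch *This, unsigned iTInfo, unsigned long lcid,
                        void **ppTInfo);
    long (*GetIDsOfNames)(MockDispatch *This, const void *riid, void *rgszNames,
                          unsigned cNames, unsigned long lcid, long *rgDispId);
    long (*Invoke)(MockDispatch *This, long dispIdMember, const void *riid,
                   unsigned long lcid, unsigned short wFlags, void *pDispParams,
                   void *pVarResult, void *pExcepInfo, unsigned *puArgErr);
} MockDispatchVtbl;

/* Slot index of a member = its byte offset divided by the size of one pointer. */
#define VTBL_SLOT(type, member) (offsetof(type, member) / sizeof(void (*)(void)))

static_assert(VTBL_SLOT(MockDispatchVtbl, QueryInterface) == 0,
              "inherited methods keep their original slots");
static_assert(VTBL_SLOT(MockDispatchVtbl, GetTypeInfoCount) == 3,
              "the first IDispatch method comes after the three IUnknown slots");
```

The same arithmetic in reverse (slot index times pointer size) gives the byte offset to pass to ''NumGet'' when reading a function pointer by hand.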
  
 +===== IDispatch Objects =====
 +
 +The IDispatch interface is Microsoft's "automation" interface, designed to allow easy integration with automation languages like Visual Basic and VBScript. Rather than following a strict structure, objects implementing the IDispatch interface only implement four additional methods on top of IUnknown's reference counter methods:
 +
 +  * (Optional) GetTypeInfoCount - Get the count of "TypeInfo" entries
 +  * (Optional) GetTypeInfo - Get a list of TypeInfo entries that describe object properties
 +  * GetIDsOfNames - Turns text names into property IDs at run-time
 +  * Invoke - Accesses a property by ID, either retrieving, setting, or calling the property as a method
 +
 +From these four methods, IDispatch allows rigidly structured languages like C++ to create or access free-form objects where the properties may not all be known at compile time. AutoHotkey itself uses IDispatch as the basis for all its objects, and handles accessing IDispatch properties transparently with regular object syntax.
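To illustrate the idea from the implementing side, here is a toy sketch in portable C of the two-step late binding that GetIDsOfNames and Invoke provide: a name is turned into a numeric ID at run time, and the ID is then used to perform the call. All names (''MockMember'', ''mock_get_id_of_name'', ''mock_invoke'', and the two sample members) are hypothetical, and the real interface passes arguments as VARIANTs rather than a single ''long''. Only the ''-1'' sentinel is taken from the real API, where it is called ''DISPID_UNKNOWN''.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MOCK_DISPID_UNKNOWN (-1L) /* same value as the real DISPID_UNKNOWN */

/* One run-time member: a name, the ID handed out for it, and its handler. */
typedef struct MockMember {
    const char *name;
    long        id;
    long      (*handler)(long arg);
} MockMember;

static long mock_double(long arg) { return arg * 2; }
static long mock_negate(long arg) { return -arg; }

static const MockMember mock_members[] = {
    { "Double", 1, mock_double },
    { "Negate", 2, mock_negate },
};
#define MOCK_MEMBER_COUNT (sizeof mock_members / sizeof mock_members[0])

/* GetIDsOfNames analogue: turn a text name into an ID at run time. */
static long mock_get_id_of_name(const char *name) {
    for (size_t i = 0; i < MOCK_MEMBER_COUNT; i++) {
        if (strcmp(mock_members[i].name, name) == 0)
            return mock_members[i].id;
    }
    return MOCK_DISPID_UNKNOWN;
}

/* Invoke analogue: call the member with the given ID.
 * Returns 0 on success, or -1 if no member has that ID. */
static int mock_invoke(long id, long arg, long *result) {
    assert(result != NULL);
    for (size_t i = 0; i < MOCK_MEMBER_COUNT; i++) {
        if (mock_members[i].id == id) {
            *result = mock_members[i].handler(arg);
            return 0;
        }
    }
    return -1;
}
```

A caller that knows only the string "Negate" first asks for its ID and then invokes by that ID; at no point does it need a compile-time declaration of the member.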
 +
 +<tabbox Native Syntax>
 +
 +<code autohotkey>
 +#Requires AutoHotkey v2
 +
 +; Retrieve a WScript.Shell IDispatch object using its human-readable ProgID.
 +; You could also specify CLSID "{72C24DD5-D70A-438B-8A42-98424B88AFB8}" instead.
 +shell := ComObject("WScript.Shell")
 +
 +; This call first invokes GetIDsOfNames to convert "Exec" into a property ID,
 +; then it calls Invoke with that ID, specifying this should be a method call
 +; with the given parameter "calc.exe".
 +shell.Exec("calc.exe")
 +</code>
 +
 +<tabbox ComCall Syntax>
 +
 +<code autohotkey>
 +#Requires AutoHotkey v2
 +
 +; Retrieve a WScript.Shell IDispatch object using its human-readable ProgID.
 +; You could also specify CLSID "{72C24DD5-D70A-438B-8A42-98424B88AFB8}" instead.
 +shell := ComObject("WScript.Shell")
 +
 +name := "Exec"
 +arg1 := "calc.exe"
 +
 +; Retrieve the ID for method "Exec"
 +IID_NULL := Buffer(16, 0)
 +names := Buffer(A_PtrSize * 1, 0)
 +NumPut("Ptr", StrPtr(name), names)
 +ids := Buffer(16 * 1, 0)
 +ComCall(5, ComObjValue(shell), ; shell.GetIDsOfNames
 + "Ptr", IID_NULL, ; REFIID   riid
 + "Ptr", names,    ; LPOLESTR *rgszNames
 + "UInt", 1,       ; UINT     cNames
 + "Ptr", 0,        ; LCID     lcid
 + "Ptr", ids,      ; DISPID   *rgDispId
 + "Int" ; HRESULT
 +)
 +execId := NumGet(ids, "Int")
 +
 +; Stage the arguments for the call
 +args := Buffer((8+A_PtrSize*2) * 1, 0) ; one argument
 +NumPut(
 + "Short", 8,          ; VARTYPE vt = VT_BSTR
 + "Short", 0,          ; WORD wReserved1
 + "Short", 0,          ; WORD wReserved2
 + "Short", 0,          ; WORD wReserved3
 + "Ptr", StrPtr(arg1), ; BSTR bstrVal = arg1
 + args
 +)
 +dp := Buffer(A_PtrSize*2+8, 0)
 +NumPut(
 + "Ptr", args.Ptr, ; VARIANTARG *rgvarg
 + "Ptr", 0,        ; DISPID     *rgdispidNamedArgs
 + "UInt", 1,       ; UINT       cArgs
 + "UInt", 0,       ; UINT       cNamedArgs
 + dp
 +)
 +
 +; Call "Exec" with those arguments
 +res := Buffer((8+A_PtrSize*2) * 1, 0) ; one result
 +ComCall(6, ComObjValue(shell), ; shell.Invoke
 + "Int", execId,   ; DISPID     dispIdMember - The member to invoke
 + "Ptr", IID_NULL, ; REFIID     riid
 + "Ptr", 0,        ; LCID       lcid
 + "Int", 1,        ; WORD       wFlags = DISPATCH_METHOD
 + "Ptr", dp,       ; DISPPARAMS *pDispParams
 + "Ptr", res,      ; VARIANT    *pVarResult
 + "Ptr", 0,        ; EXCEPINFO  *pExcepInfo
 + "Ptr", 0,        ; UINT       *puArgErr
 + "Int" ; HRESULT
 +)
 +</code>
 +
 +</tabbox>
 +
 +----
 +
 +[[https://www.autohotkey.com/board/topic/56987-com-object-reference-autohotkey-l/|COM Object Reference [AutoHotkey v1.1+] (archived forum)]]
 +
 +[[https://redd.it/y1b8ht|Inspection of IDispatch COM objects using Powershell]]