Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wasm-interp: Add multi-module support #2009

Open
wants to merge 21 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 6 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
182 changes: 163 additions & 19 deletions src/tools/wasm-interp.cc
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <map>
#include <memory>
#include <string>
#include <vector>
Expand All @@ -39,6 +40,9 @@
using namespace wabt;
using namespace wabt::interp;

using ExportMap = std::map<std::string, Extern::Ptr>;
using Registry = std::map<std::string, ExportMap>;
sbc100 marked this conversation as resolved.
Show resolved Hide resolved

static int s_verbose;
static const char* s_infile;
static Thread::Options s_thread_options;
Expand All @@ -48,6 +52,7 @@ static bool s_host_print;
static bool s_dummy_import_func;
static Features s_features;
static bool s_wasi;
static std::vector<std::string> s_modules;
static std::vector<std::string> s_wasi_env;
static std::vector<std::string> s_wasi_argv;
static std::vector<std::string> s_wasi_dirs;
Expand All @@ -56,6 +61,8 @@ static std::unique_ptr<FileStream> s_log_stream;
static std::unique_ptr<FileStream> s_stdout_stream;
static std::unique_ptr<FileStream> s_stderr_stream;

static Registry s_registry;

static Store s_store;

static const char s_description[] =
Expand Down Expand Up @@ -110,6 +117,13 @@ static void ParseOptions(int argc, char** argv) {
parser.AddOption(
'd', "dir", "DIR", "Pass the given directory the the WASI runtime",
[](const std::string& argument) { s_wasi_dirs.push_back(argument); });
parser.AddOption(
'm', "module", "[(@|$)NAME:]FILE",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we can come up with better name for this parameter? How about something like --preload?

Also perhaps we could just get away without adding a new option here at all and just treat these module like main binary we are running. i.e could we just allow wasm-interp to take any number of binaries as arguments rather than the current limit of just one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i wanted module to be used for the import/exprot because thats what its called in the docs of WebAssembly. --preload would not be to far fetched, but I still think --module (or just --mod) would still be best. (You can choose, just wanted to give my reasoning).

Sadly if we want it to interoperate with --wasi (and as long as we don't have a glue for native libraries) a new option is a must.
I was thinking about multiple different ways, but the OptionParser does not really like them. I cannot specify multiple modules, then a main one, and then optional parameters without some new option to mark the main module. I guessed -m for conveniance á la '-L"path" -llibname' would be best and in the end just stuck to the -m/--module to be more like the other options available.

I was thinking about something to glue native libraries and just have --wasi be an indicator which native library to give the optional arguments to.

references:
https://webassembly.github.io/spec/core/valid/modules.html#imports
https://webassembly.github.io/spec/core/text/modules.html#imports
https://webassembly.github.io/spec/core/syntax/modules.html#imports
https://webassembly.github.io/spec/core/binary/modules.html#import-section

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still a new option, but if I waht you want, this would be better suited:
we could have an -a, --arg parameter right before the arguments. I just don't know how to make it optional and still have a list of arguments following it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know now why I did this, only the main module has access to --wasi the other ones don't.
I first need to figure out how to provide wasi support to all of them, then how to do the --arg thingy and then I don't mind.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Speaking of -L.. I wonder if we could have wasi-interp automatically looks for wasm files in certain locations and load them on demand to satisfy imports?

We could even do something fancy like JS import maps: https://github.com/WICG/import-maps

But all that is farther down the line after we land this initial support.

"load module FILE and provide all exports under NAME for upcoming "
"imports. if NAME is empty, the preference will be used. "
/*"$ prefers the debug-name in the name section, "*/
"@ prefers file name (without extension).",
[](const std::string& argument) { s_modules.push_back(argument); });
parser.AddOption(
"run-all-exports",
"Run all the exported functions, in order. Useful for testing",
Expand Down Expand Up @@ -161,33 +175,126 @@ Result RunAllExports(const Instance::Ptr& instance, Errors* errors) {
return result;
}

static void BindImports(const Module::Ptr& module, RefVec& imports) {
static bool IsHostPrint(const ImportDesc& import) {
return import.type.type->kind == ExternKind::Func &&
((s_host_print && import.type.module == "host" &&
import.type.name == "print") ||
s_dummy_import_func);
}

static Ref GenerateHostPrint(const ImportDesc& import) {
auto* stream = s_stdout_stream.get();

auto func_type = *cast<FuncType>(import.type.type.get());
auto import_name = StringPrintf("%s.%s", import.type.module.c_str(),
import.type.name.c_str());

auto host_func = HostFunc::New(
s_store, func_type,
[=](Thread& thread, const Values& params, Values& results,
Trap::Ptr* trap) -> Result {
printf("called host ");
WriteCall(stream, import_name, func_type, params, results, *trap);
return Result::Ok;
});

return host_func.ref();
}

static Extern::Ptr GetImport(const std::string& module,
const std::string& name) {
auto mod_iter = s_registry.find(module);
if (mod_iter != s_registry.end()) {
auto extern_iter = mod_iter->second.find(name);
if (extern_iter != mod_iter->second.end()) {
return extern_iter->second;
}
}
return {};
}

static void PopulateExports(const Instance::Ptr& instance,
Module::Ptr& module,
ExportMap& map) {
map.clear();
// Module::Ptr module{s_store, instance->module()};
for (size_t i = 0; i < module->export_types().size(); ++i) {
const ExportType& export_type = module->export_types()[i];
auto export_ = s_store.UnsafeGet<Extern>(instance->exports()[i]);
map[export_type.name] = export_;
}
}

static std::string GetPathName(std::string module_arg) {
size_t path_at = module_arg.find_first_of(':');

path_at = path_at == std::string::npos ? 0 : path_at + 1;

return module_arg.substr(path_at);
}

static std::string FileNameOfPath(std::string path_name) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can use GetBasename + StripExtension from wabt/filenames.h?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the notice. I did not even know that exists already.

// use file_name (without extension)
size_t fstart = path_name.find_last_of("/\\");
size_t fend = path_name.find_last_of(".");

fstart = fstart == std::string::npos ? 0 : fstart;
fend = fend < fstart ? std::string::npos : fend;

return path_name.substr(fstart, fend);
}

static std::string GetRegistryName(std::string module_arg,
const Module::Ptr& module) {
size_t split_pos = module_arg.find_first_of(':');
if (split_pos == std::string::npos) {
split_pos = 0;
}
std::string override_name = module_arg.substr(0, split_pos ? split_pos : 0);
std::string path_name = module_arg.substr(split_pos);
std::string debug_name = ""; // GetDebugName(module);

// check if override_name starts with @ and return override_name
if (override_name[0] == '@') {
if (override_name.size() > 1) {
return override_name.substr(1);
}
// ignore debug-name and use filename
return FileNameOfPath(path_name);
}

// if no override_name is provided -> use debug name or filename
if (override_name.empty()) {
// prefer debug_name, if no override_name is provided
if (!debug_name.empty()) {
return debug_name;
}
// use file_name (without extension)
return FileNameOfPath(path_name);
}

// prefer debug_name if override_name starts with '$'
if (override_name[0] == '$' && !debug_name.empty()) {
return debug_name;
}

// return the override_name (remove leading '$' if present)
bool leading_dollar = override_name[0] == '$';
return override_name.substr(leading_dollar ? 1 : 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this dollar sign syntax? What about this simpler scheme:

  1. Override name, if present
  2. Debug name, if present
  3. Fall back to file name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need something to explicitly go for the debug name. I dont want something unexpected. Like downloading a module from the web, and not have it use the filename, because you might not check for the debug name. (just for convenience)

I came up with

  1. override_name if present
  2. if override_name == "$" debug_name if present
  3. Fall back to file_name
  if (!override_name.empty()) {
    // prefer override_name unless its just "$"
    if (override_name != "$") {
      return override_name;
    }
    
    // prefer debug_name if override_name == "$"
    if (!debug_name.empty()) {
      return debug_name;
    }
  }
  
  // fall back to file-name
  return FileNameOfPath(path_name);

Copy link
Contributor Author

@tDwtp tDwtp Sep 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or we go for this:

  1. if override_name == "@" use file_name
  2. overriide_name if present
  3. debug_name if present
  4. Fall back to file_name

Binary compatibility would then be prefered

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just don't think we need a way to force the file name to be used over the debug name. If you really want to first the filename why not just use the override mechism. e.g -m libfoo:libfoo.wasm.

In other words, always prefer the debug name, but give users an opt out to override if they want to. That would avoid the magic @ thing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was just for conveniance... But tbh I don't care to much.
Its done already. I removed the magic thingy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I also can't find a way to get the DebugName, my best guess was it's hidden somewhere in Instance, but I cant find it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sbc100 minor question, wouldn't it make sense to just ... use all three? Like: Always register by file_name, if exists register by debug_name, if exists register by override_name.
This could be an idea for a revision. And as soon as I findout how to get debug_name. (It seems like I need to query it manually from the file? The other approaches used in other files are ... complicated)

}

static void BindImports(const Module::Ptr& module, RefVec& imports) {
for (auto&& import : module->desc().imports) {
if (import.type.type->kind == ExternKind::Func &&
((s_host_print && import.type.module == "host" &&
import.type.name == "print") ||
s_dummy_import_func)) {
auto func_type = *cast<FuncType>(import.type.type.get());
auto import_name = StringPrintf("%s.%s", import.type.module.c_str(),
import.type.name.c_str());

auto host_func = HostFunc::New(
s_store, func_type,
[=](Thread& thread, const Values& params, Values& results,
Trap::Ptr* trap) -> Result {
printf("called host ");
WriteCall(stream, import_name, func_type, params, results, *trap);
return Result::Ok;
});
imports.push_back(host_func.ref());
if (IsHostPrint(import)) {
imports.push_back(GenerateHostPrint(import));
continue;
}

// By default, just push an null reference. This won't resolve, and
// instantiation will fail.
imports.push_back(Ref::Null);

Extern::Ptr extern_ = GetImport(import.type.module, import.type.name);
imports.push_back(extern_ ? extern_.ref() : Ref::Null);
}
}

Expand Down Expand Up @@ -239,6 +346,37 @@ static Result ReadAndRunModule(const char* module_filename) {

RefVec imports;

// if we only have dummy imports, we wont need to reghister anything
// but we still want to load them.
std::vector<Module::Ptr> modules_loaded(s_modules.size());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we unify the loading of the main module_filename above with the loading of s_modules here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about it early, but dropped it, because I did not understand how it would make it work.
Probably why I have that vector in the first place.
Now on my TODO. After I figure out how to do the module option thingy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I add a Divider to Option Parser?

  parser.AddArgument("filename", OptionParser::ArgumentCount::OneOrMore, [...]);
  // parser.AddDivider("--", OptionParser::Divider::Optional);
  parser.AddArgument("arg", OptionParser::ArgumentCount::ZeroOrMore, [...]);

This way I could distinguish between modules and wasi arguments.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think using -- to distinguish between arguments passed to wasi and argument to wasm-interp itself is good idea.

In general I'm a fan of the direction of this PR, and I think we could probably even land with without too much more iteration, but I would also like to be able to iterate on the ergonomics of it. But perhaps we can do a lot of that iteration after first landing it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well: Off I go to find out how to implement that... Possibly wihtout implementing a whole Divider thing.

for (auto import_module : s_modules) {
ExportMap load_map;
std::string module_path = GetPathName(import_module);

Module::Ptr load_module;
result = ReadModule(module_path.c_str(), &errors, &load_module);
if (!Succeeded(result)) {
FormatErrorsToFile(errors, Location::Type::Binary);
return result;
}

RefVec load_imports;
BindImports(load_module, load_imports);

// this is how wasm-interp.exe does it
Instance::Ptr load_instance;
CHECK_RESULT(InstantiateModule(load_imports, load_module, &load_instance));

// we wont need to register anything, if we only have dummy imports anyway
if (!s_dummy_import_func) {
std::string reg_name = GetRegistryName(import_module, load_module);
PopulateExports(load_instance, load_module, load_map);
s_registry[reg_name] = std::move(load_map);
}

modules_loaded.push_back(load_module);
}

#if WITH_WASI
uvwasi_t uvwasi;
#endif
Expand Down Expand Up @@ -327,6 +465,12 @@ int ProgramMain(int argc, char** argv) {
s_store.setFeatures(s_features);

wabt::Result result = ReadAndRunModule(s_infile);

for (auto& pair : s_registry) {
pair.second.clear();
}
s_registry.clear();
sbc100 marked this conversation as resolved.
Show resolved Hide resolved

return result != wabt::Result::Ok;
}

Expand Down
1 change: 1 addition & 0 deletions test/help/wasm-interp.txt
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,7 @@ options:
--wasi Assume input module is WASI compliant (Export WASI API the the module and invoke _start function)
-e, --env=ENV Pass the given environment string in the WASI runtime
-d, --dir=DIR Pass the given directory the the WASI runtime
-m, --module=[(@|$)NAME:]FILE load module FILE and provide all exports under NAME for upcoming imports. if NAME is empty, the preference will be used. @ prefers file name (without extension).
--run-all-exports Run all the exported functions, in order. Useful for testing
--host-print Include an importable function named "host.print" for printing to stdout
--dummy-import-func Provide a dummy implementation of all imported functions. The function will log the call and return an appropriate zero value.
Expand Down