How to protect (obfuscate/DRM) trained model weights in TensorFlow.js?

Updated: 2023-12-02 19:46:28

Client-side code obfuscation will never completely prevent this. Use a server instead.

If your client-side application contains the model, the user will be able to extract it one way or another. You can make this harder, but it will always be possible. Some techniques to make it harder are:

  • Obfuscate your code: That way the user cannot easily read your code and comments. Depending on your build tools, this might already be done for you when you produce a "production ready" build.
  • Obfuscate the library and its public API: Even if your own code is obfuscated, the user might still guess what is going on from the library's public API calls. Example: it would be rather easy to set a breakpoint at the model.predict function and debug your code from there. Obfuscating the libraries and their APIs as well makes this harder.
  • Put "special checks" in your code: You could also check whether the page the code is running on is your page (e.g., whether the domain matches), and so on. You will want to obfuscate this code as well.
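
As an illustration of the "special checks" item, here is a minimal sketch of a domain check. The allow-list, the host names, and the function names are assumptions made for this example, not part of any library. Since the check itself runs on the client, it only raises the bar and can still be removed by a determined user.

```javascript
// Hypothetical allow-list; replace with the hosts that actually serve your app.
const ALLOWED_HOSTS = ["example.com", "www.example.com"];

// Returns true only when the hostname exactly matches an allowed host.
function isAllowedHost(hostname, allowedHosts = ALLOWED_HOSTS) {
  return allowedHosts.includes(String(hostname).toLowerCase());
}

// Throws when the page is served from an unknown host.
// In a browser you would pass window.location before loading the model.
function assertRunningOnOwnPage(locationLike) {
  if (!isAllowedHost(locationLike.hostname)) {
    throw new Error("Refusing to run on an unknown host");
  }
}
```

In a browser, calling assertRunningOnOwnPage(window.location) before the model is loaded would stop the code from running on a copied page, at least until someone strips the check out.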

Even if your code is perfectly obfuscated and well protected, your client-side code still contains your model somewhere. With these methods, it will always be possible to extract your model somehow.

To make it impossible to get your model, you need a different approach. Put only your "dumb logic" on the client and exclude the part of the code that you want to protect. Instead, offer an API on your server that executes the "protected part" of your code.
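
A rough sketch of what the server-side "protected part" could look like. Everything here is assumed for illustration: the /predict route, the features field, the handlePredict name, and the stand-in model with made-up weights. A real server would load the actual model once and keep it in memory.

```javascript
// Stand-in for the real model, which would be loaded once on the server.
// The weights below are made up for this example.
const protectedModel = {
  predict(features) {
    const weights = [0.5, -0.25, 2];
    return features.reduce((sum, x, i) => sum + x * (weights[i] ?? 0), 0);
  },
};

// Handles the JSON body of a hypothetical POST /predict request.
// Returns a status code and a JSON-serializable body, never the weights.
function handlePredict(rawBody, model = protectedModel) {
  let payload;
  try {
    payload = JSON.parse(rawBody);
  } catch {
    return { status: 400, body: { error: "Body must be valid JSON" } };
  }
  const features = payload && payload.features;
  const valid =
    Array.isArray(features) &&
    features.length > 0 &&
    features.every((x) => typeof x === "number" && Number.isFinite(x));
  if (!valid) {
    return { status: 400, body: { error: "features must be a non-empty array of numbers" } };
  }
  return { status: 200, body: { prediction: model.predict(features) } };
}
```

The handler is a plain function so that it can be called from the route callback of whichever HTTP framework the backend uses. Only the prediction leaves the server.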

This way, instead of running model.predict on the client side, you make an AJAX request (with the parameters) to your backend, which returns the results. The user then only sees the input and the output and cannot extract the model itself.
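
On the client, the call to model.predict could then be replaced by something like the following sketch. The /predict URL, the features and prediction field names, and the requestPrediction function are assumptions for this example; the fetch implementation is injectable so the function can be tried without a network.

```javascript
// Sends the inputs to a hypothetical /predict endpoint and returns the result.
// fetchImpl defaults to the global fetch available in browsers and recent Node.
async function requestPrediction(features, fetchImpl = fetch, url = "/predict") {
  const response = await fetchImpl(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ features }),
  });
  if (!response.ok) {
    throw new Error(`Prediction request failed with status ${response.status}`);
  }
  const { prediction } = await response.json();
  return prediction;
}
```

A caller would write something like const result = await requestPrediction([2, 4, 1]); and never touch the model or its weights.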

Keep in mind that this means a lot more work, as you have to write code not only for your client-side application but also for your server-side application, including the API. Depending on what your application looks like (e.g., does it have a login?), this might be a lot more code.