Implementing PDF Annotation in the Front End

September 3, 2024

Category: Document Scanning

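Dynamsoft Document Viewer is an SDK for viewing and managing document images. Version 2.0 added annotation support, and it can be combined with the following SDKs to build a document scanning solution:

Dynamic Web TWAIN: access document scanners from the browser.
Dynamsoft Document Normalizer: detect document boundaries.

In this article, we look at which annotations it supports, some use cases, and the code needed to use them.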

Online demo with document scanning and annotation

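Supported Annotations

Currently, shape, ink, text and image annotations are supported.

Shapes:

Rectangle
Ellipse
Polygon
Polyline
Line

Text:

Text box
Text typewriter

Others:

Ink (freehand drawing)
Stamp (custom image)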

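Some Use Cases

With annotations, we can handle several common document management tasks:

Add comments and highlights.
Add a barcode or QR code to identify a document.
Redact sensitive information, such as a contact's name or photo.
Proofread a manuscript by drawing marks on it.
Add a stamp that indicates the document's status.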

Implementation

Below is the code involved in adding annotations.

Add Dynamsoft Document Viewer

To use Dynamsoft Document Viewer in a web page, include the following files:

<script src="https://cdn.jsdelivr.net/npm/dynamsoft-document-viewer@latest/dist/ddv.js"></script>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/dynamsoft-document-viewer@latest/dist/ddv.css">

Then initialize it with the following code:

Dynamsoft.DDV.Core.license = "DLS2eyJoYW5kc2hha2VDb2RlIjoiMjAwMDAxLTE2NDk4Mjk3OTI2MzUiLCJvcmdhbml6YXRpb25JRCI6IjIwMDAwMSIsInNlc3Npb25QYXNzd29yZCI6IndTcGR6Vm05WDJrcEQ5YUoifQ=="; // Public trial license which is valid for 24 hours
Dynamsoft.DDV.Core.engineResourcePath = "https://cdn.jsdelivr.net/npm/dynamsoft-document-viewer@latest/dist/engine"; // Path to the folder containing the distributed WASM files
await Dynamsoft.DDV.Core.loadWasm();
await Dynamsoft.DDV.Core.init();

You can apply for a license here.

Create an Edit Viewer for Annotation

We need to create an Edit Viewer instance to view and add annotations.

const editViewerUiConfig = {
    type: Dynamsoft.DDV.Elements.Layout,
    flexDirection: "column",
    className: "ddv-edit-viewer-mobile",
    children: [
        {
            type: Dynamsoft.DDV.Elements.Layout,
            className: "ddv-edit-viewer-header-mobile",
            children: [
                Dynamsoft.DDV.Elements.Pagination,
                {
                    type: Dynamsoft.DDV.Elements.Button,
                    className: "ddv-button-done",
                    events:{
                        click: "close"
                    }
                }
            ],
        },
        Dynamsoft.DDV.Elements.MainView,
        {
            type: Dynamsoft.DDV.Elements.Layout,
            className: "ddv-edit-viewer-footer-mobile",
            children: [
                Dynamsoft.DDV.Elements.DisplayMode,
                Dynamsoft.DDV.Elements.RotateLeft,
                Dynamsoft.DDV.Elements.Crop,
                Dynamsoft.DDV.Elements.Filter,
                Dynamsoft.DDV.Elements.Undo,
                Dynamsoft.DDV.Elements.Delete,
                Dynamsoft.DDV.Elements.AnnotationSet
            ],
        },
    ],
};

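// Create an edit viewer.
// captureViewer is assumed to be a viewer created earlier in the page;
// sharing its groupUid places both viewers in the same group.
editViewer = new Dynamsoft.DDV.EditViewer({
    container: "fullscreenContainer",
    groupUid: captureViewer.groupUid,
    uiConfig: editViewerUiConfig
});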

// Configure image filter feature which is in edit viewer
Dynamsoft.DDV.setProcessingHandler("imageFilter", new Dynamsoft.DDV.ImageFilter());

Open the Edit Viewer and click the annotation icon to add annotations through the UI.

Edit Viewer

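Add Annotations via Code

Besides using the Edit Viewer's UI, we can also add annotations via code.

For example, suppose we need to redact text such as a name. We can use OCR to get the bounding boxes of the words and then add rectangle annotations at those positions. Note that annotation coordinates are measured in points, so the pixel values have to be converted to points first.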

// doc (the document opened in the viewer) and bbox (a word's bounding box
// in pixels, taken from the OCR result) are assumed to be defined elsewhere.
const pageUid = editViewer.indexToUid(pageIndex);
const pageData = await doc.getPageData(pageUid);
const scaleX = pageData.mediaBox.width / pageData.raw.width; // for converting pixels to points
const scaleY = pageData.mediaBox.height / pageData.raw.height;
const options = {
  x: bbox.x0 * scaleX,
  y: bbox.y0 * scaleY,
  width: (bbox.x1 - bbox.x0) * scaleX,
  height: (bbox.y1 - bbox.y0) * scaleY,
  borderColor: "black",
  background: "black",
  //flags: {readOnly: true} // enable read-only if you do not need to interact with the annotation
};
const rect = Dynamsoft.DDV.annotationManager.createAnnotation(pageUid, "rectangle", options);
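
The redaction steps described above (pick the sensitive words from the OCR output, then convert their pixel boxes to points) can be sketched as two pure functions. This is only an illustration under stated assumptions: the helper names `findSensitiveWords` and `bboxToRectOptions` are hypothetical, and the word shape `{ text, bbox: { x0, y0, x1, y1 } }` is an assumption about what the OCR engine returns.

```javascript
// Hypothetical helpers; the OCR word shape { text, bbox: { x0, y0, x1, y1 } }
// is an assumption, not something defined by Dynamsoft Document Viewer.

// Keep only the OCR words whose text matches one of the sensitive terms
// (case-insensitive, exact match after trimming).
function findSensitiveWords(words, terms) {
  const wanted = new Set(terms.map((t) => t.toLowerCase()));
  return words.filter((w) => wanted.has(w.text.trim().toLowerCase()));
}

// Convert a pixel bounding box into rectangle options expressed in points.
// rawSize is the image size in pixels, mediaBox the page size in points.
function bboxToRectOptions(bbox, rawSize, mediaBox) {
  const scaleX = mediaBox.width / rawSize.width;
  const scaleY = mediaBox.height / rawSize.height;
  return {
    x: bbox.x0 * scaleX,
    y: bbox.y0 * scaleY,
    width: (bbox.x1 - bbox.x0) * scaleX,
    height: (bbox.y1 - bbox.y0) * scaleY,
    borderColor: "black",
    background: "black",
  };
}

// Example: a 1000x2000 px image placed on a 500x1000 pt page (scale 0.5).
const words = [
  { text: "Name:", bbox: { x0: 20, y0: 40, x1: 120, y1: 80 } },
  { text: "Alice", bbox: { x0: 140, y0: 40, x1: 260, y1: 80 } },
];
const rects = findSensitiveWords(words, ["alice"]).map((w) =>
  bboxToRectOptions(
    w.bbox,
    { width: 1000, height: 2000 },
    { width: 500, height: 1000 }
  )
);
console.log(rects);
```

Each resulting object could then be passed as the options argument of `createAnnotation(pageUid, "rectangle", options)`, as in the snippet above. Keeping the conversion separate from the SDK call makes it easy to check the arithmetic without loading the viewer.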

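Other annotation types can be added in the same way. See the documentation to learn more.

Save as PDF

We can save the images as a PDF file that contains the annotations.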

const pdfSettings = {
  saveAnnotation: "annotation"
};
const blob = await doc.saveToPdf(pdfSettings);

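The saveAnnotation setting specifies how the annotations are saved. It has the following options:

none: do not save the annotations.
image: merge the annotations into the image.
annotation: save the annotations as PDF annotations.
flatten: merge the annotations into a single layer so that they are no longer editable.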

If we choose the annotation option, the annotations remain editable after export, either in Dynamsoft Document Viewer or in other PDF editors such as Adobe Acrobat.
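
Because saveAnnotation is a plain string, a typo in it is easy to miss. A minimal sketch of a guard follows, assuming only the four option values listed above; the helper name `buildPdfSettings` and the constant `SAVE_ANNOTATION_MODES` are hypothetical and not part of the SDK.

```javascript
// The four modes listed in this article; the helper itself is hypothetical.
const SAVE_ANNOTATION_MODES = ["none", "image", "annotation", "flatten"];

// Build the settings object for saveToPdf, rejecting unknown modes early.
function buildPdfSettings(mode = "annotation") {
  if (!SAVE_ANNOTATION_MODES.includes(mode)) {
    throw new RangeError(`Unknown saveAnnotation mode: ${mode}`);
  }
  return { saveAnnotation: mode };
}

console.log(buildPdfSettings("flatten"));
```

The returned object could be passed to `doc.saveToPdf(...)` in place of the literal `pdfSettings` object shown earlier.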

Load a PDF File with Annotations

If we have a PDF file that contains annotations, we can load it with the following code:

let pdfSource = {
  fileData:blob,
  renderOptions:{
    renderAnnotations:Dynamsoft.DDV.EnumAnnotationRenderMode.LOAD_ANNOTATIONS
  }
}
await doc.loadSource(pdfSource);

There are several modes for loading annotations:

enum EnumAnnotationRenderMode {
    NO_ANNOTATIONS = "noAnnotations", // default; the annotations in the PDF file are not loaded
    RENDER_ANNOTATIONS = "renderAnnotations", // the annotations in the PDF file are rendered
    LOAD_ANNOTATIONS = "loadAnnotations", // the annotations in the PDF file are loaded normally; a valid PDF Annotation license is required
}

If we choose the LOAD_ANNOTATIONS mode, we can continue editing the PDF file's existing annotations after loading it.
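
The choice between the three modes can be sketched as a small decision helper. This is an illustration only: `pickRenderMode` and `buildPdfSource` are hypothetical names, the `show` and `editable` flags are assumptions about what an application might track, and the string values are copied from the enum listing above so that the sketch runs without the SDK.

```javascript
// String values copied from the EnumAnnotationRenderMode listing above.
const RenderMode = Object.freeze({
  NO_ANNOTATIONS: "noAnnotations",
  RENDER_ANNOTATIONS: "renderAnnotations",
  LOAD_ANNOTATIONS: "loadAnnotations",
});

// Hypothetical helper: editable annotations need LOAD_ANNOTATIONS,
// view-only annotations need RENDER_ANNOTATIONS, otherwise skip them.
function pickRenderMode({ show = false, editable = false } = {}) {
  if (editable) return RenderMode.LOAD_ANNOTATIONS;
  if (show) return RenderMode.RENDER_ANNOTATIONS;
  return RenderMode.NO_ANNOTATIONS;
}

// Build a source object with the same shape as pdfSource above.
function buildPdfSource(fileData, wants) {
  return {
    fileData,
    renderOptions: { renderAnnotations: pickRenderMode(wants) },
  };
}

console.log(buildPdfSource("placeholder-blob", { editable: true }));
```

In a real page, `fileData` would be the PDF Blob and the result would be passed to `doc.loadSource(...)`.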

Source Code

Check out the demo's source code and try it yourself:

https://github.com/tony-xlh/document-viewer-samples/tree/main/scan-and-annotate