• Daniel Quinn@lemmy.ca
    English
    arrow-up
    129
    ·
    11 months ago

    Ha! I wrote it! Well the original anyway. It’s been forked a few times since I stepped away.

    So yeah, I think it’s pretty cool 😆

      • Daniel Quinn@lemmy.ca
        English
        arrow-up
        104
        ·
        11 months ago

        Actually, I stepped away from the project 'cause I stopped using it altogether. I started the project to satisfy the British government with their ridiculous requirements for proof of my relationship with my wife so I could live here. Once I was settled though and didn’t need to be able to bring up flight itineraries from 5 years ago, it stopped being something I needed.

        Well that, and lemme tell you, maintaining a popular Free software project is HARD. Everyone has an idea of where stuff should go, but most of the contributions come in piecemeal, so you’re left mostly acting as the one trying to wrangle different styles and architectures into something cohesive… while you’re also holding down a day job. It was stressful to say the least, and with a kid on the way, something had to give.

        But every once in a while I consider installing paperless-ngx just to see how it’s come along, and how much has changed. I’m absolutely delighted that it’s been running and growing in my absence, and from the screenshots alone, I see that a lot of the ideas people had when I was helming made it in in the end.

        • null@slrpnk.net
          English
          arrow-up
          22
          ·
          11 months ago

          Oh wow! Quite a journey!

          I’d consider Paperless a hall-of-famer for self-hosted software and something most people who get into self-hosting discover at some point, even if they don’t use it.

          So thanks for building it, even if you’ve moved on. You gave the forkers something great to build from.

        • warmaster@lemmy.world
          English
          arrow-up
          2
          ·
          11 months ago

          Thank you very much for the code and time you so generously contributed while working on it. The effort you put in will live on for many years to come.

    • zaphod@lemmy.ca
      English
      arrow-up
      6
      ·
      11 months ago

      Just want to say thank you! Paperless is one of the first things I recommend to anyone considering self hosting their infra. Amazing piece of work!

      • Daniel Quinn@lemmy.ca
        English
        arrow-up
        2
        ·
        11 months ago

        Thanks! The crazy thing is that it’s really not that complicated. I’d say the hardest work was in writing the docs :-). It’s awesome to hear that people still use it and love it though.

  • thelittleblackbird@lemmy.world
    English
    arrow-up
    39
    ·
    11 months ago

    Super good. It is incredibly useful, and the ability to find any document from almost anywhere in seconds is awesome.

    That said, you need to stick to a process, and it is time-consuming. And of course, you need to manually review the results of the automatic tagging feature.

    So it is not the set-and-forget tool that most people expect.

  • tofubl@discuss.tchncs.de
    English
    arrow-up
    27
    arrow-down
    1
    ·
    11 months ago

    Slow and unreliable with sqlite, but rock solid and amazing with postgres.

    Today, every document I receive goes into my duplex ADF scanner to scan to a network share which is monitored by Paperless. Documents there are ingested and pre-tagged, waiting for me to review them in the inbox. Unlike other posters here, I find the tagging process extremely fast and easy. Granted, I didn’t have to bring in thousands of documents to begin with but started from a clean slate.

    What’s more, development is incredibly fast-moving and really useful features are added all the time.

    • Atemu
      English
      arrow-up
      5
      ·
      11 months ago

      Slow and unreliable with sqlite, but rock solid and amazing with postgres.

      I haven’t noticed any major performance issues with sqlite. What tasks improved for you when you moved to postgres?

      • tofubl@discuss.tchncs.de
        English
        arrow-up
        3
        arrow-down
        1
        ·
        11 months ago

        Page loading times, general stability. Everything, really.

        I set it up with sqlite initially to test if it was for me, and was surprised how flaky it felt given how highly people spoke about it. I’m really glad I tried with postgres instead of just tearing it down. But my experience is highly anecdotal, of course.

  • Morethanevil@lemmy.fedifriends.social
    English
    arrow-up
    25
    arrow-down
    1
    ·
    11 months ago

    Made my life a lot easier. No more looking for documents, everything is in one place, full-text search… Don’t ever want to go back.

    But most important: always have a backup ☝🏻

      • Morethanevil@lemmy.fedifriends.social
        English
        arrow-up
        8
        ·
        11 months ago

        No. The data folder by itself only contains the documents, and only as files named like FILE00001.pdf.

        I do this with pg_dumpall and rclone. Once a day I export the database with pg_dumpall like this:

        docker exec -i paperless pg_dumpall -c -U paperless > /backup/datenbank/homeserver-postgrescontainer-`date +%d.%m.%Y`.sql

        And then I copy this file and the data folder encrypted to a secure cloud (Hetzner Storagebox)
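
        The two steps above could be wrapped in a small script along these lines. This is only a sketch under assumptions: the container name, database user and dump folder are taken from the command above, while DATA_DIR and the rclone remote name `storagebox-crypt` are made up for illustration, and the remote is assumed to be an rclone "crypt" remote that was configured beforehand.

        ```shell
        #!/bin/sh
        # Sketch of a daily Paperless backup: database dump plus document folder.
        # All names below are assumptions; adjust them to your own setup.
        CONTAINER="${CONTAINER:-paperless}"            # container that has pg_dumpall
        DB_USER="${DB_USER:-paperless}"                # Postgres user
        DUMP_DIR="${DUMP_DIR:-/backup/datenbank}"      # where dumps are written
        DATA_DIR="${DATA_DIR:-/srv/paperless}"         # hypothetical document folder
        REMOTE="${REMOTE:-storagebox-crypt:paperless}" # hypothetical rclone crypt remote

        # Build the dump path for a given date stamp (default: today, YYYY-MM-DD).
        dump_path() {
            stamp="${1:-$(date +%Y-%m-%d)}"
            printf '%s/paperless-%s.sql\n' "$DUMP_DIR" "$stamp"
        }

        run_backup() {
            out="$(dump_path)"
            # -c emits DROP statements so a restore recreates the databases.
            docker exec -i "$CONTAINER" pg_dumpall -c -U "$DB_USER" > "$out" || return 1
            # Encryption happens in the rclone crypt remote, not in this script.
            rclone copy "$out" "$REMOTE/db" || return 1
            # sync mirrors the folder: files deleted locally are deleted remotely too.
            rclone sync "$DATA_DIR" "$REMOTE/data"
        }

        # Run only when explicitly requested, e.g. from cron: backup.sh run
        if [ "${1:-}" = "run" ]; then
            run_backup
        fi
        ```

        The sketch uses a YYYY-MM-DD stamp instead of the day-first format above only because such file names sort chronologically; either works.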

        More info

        Rclone and Read more about Postgresbackup

  • Matty_r@programming.dev
    English
    arrow-up
    14
    ·
    11 months ago

    Does it do OCR? And can you create tags / naming convention / folders based on rules and text within the scanned document? I want to digitize all my paperwork, but there is so much I don’t have time to do the organizing part of it manually.

  • bluegandalf
    English
    arrow-up
    12
    ·
    11 months ago

    I had a lot of false starts with having to upload and tag >3000 documents initially. Finally made the leap and did it in December. I now use it regularly, but am still getting used to the new dynamic, but that’s a transitional thing. Overall, enjoy it and look forward to more features!

    The mobile app is a separate project and is meant as a companion app rather than a full-fledged client, which I understand. Still, it is lacking.

  • AtariDump@lemmy.world
    English
    arrow-up
    12
    arrow-down
    2
    ·
    edit-2
    11 months ago

    Tried to use it, but I don’t want to move all of my data from my currently laid out folder/file structure into a docker container that I then need to backup/upgrade/feed/water/etc., especially when my grasp on docker containers is limited (at best) and I’m dealing with “production” data.

    I wish the software worked like Immich; I could point it at a root folder and it would index everything with read-only rights.

    That, and I’m slightly worried that this iteration will stop being supported and get forked (again). It’s great that it can be forked, but I have no idea what would go into migrating data (see my limited Docker knowledge from the first sentence).

    • B0rax@feddit.de
      English
      arrow-up
      17
      arrow-down
      3
      ·
      11 months ago

      Well, you point the Docker container at some external data. You never store the documents inside the container itself, because they would be lost when it is updated.

      It is comparable to the way Immich works.

      • Antiochus@lemmy.one
        English
        arrow-up
        8
        ·
        11 months ago

        Maybe I’m misunderstanding this, but their FAQ specifically says:

        By default, your documents are stored inside the docker volume paperless_media. Docker manages this volume automatically for you.

        It also says that documents are removed from the consumption directory, renamed, and put into a folder that you shouldn’t modify.

        And that’s my problem with the project. I want to be able to keep my file name and organizational structure.

        • MaggiWuerze@feddit.de
          English
          arrow-up
          9
          arrow-down
          1
          ·
          11 months ago

          Have a look here: [paperless-ngx docker-compose.yml](https://github.com/paperless-ngx/paperless-ngx/blob/main/docker/compose/docker-compose.postgres.yml)

          Down under webserver:, you change data:/usr/src/paperless/data to /path/to/where/you/wantorhave/your/files:/usr/src/paperless/data. Do the same for the media path and you’re done. Paperless now uses a folder on your machine instead of a volume. If you want to be clean, you then also remove the volume declaration at the bottom of the file.
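
          A rough sketch of that change, done through a Compose override file rather than by editing the linked file: Docker Compose merges a docker-compose.override.yml automatically, and volume entries are matched by their path inside the container. The service name webserver and the data path come from the linked file; the media container path /usr/src/paperless/media and the host folder /srv/paperless are assumptions to check against your own setup.

          ```shell
          #!/bin/sh
          # Write a Compose override that swaps the named volumes for host folders.
          # The host root is a hypothetical location; pick a folder you already back up.
          write_override() {
              host_root="$1"   # e.g. /srv/paperless
              target="$2"      # e.g. ./docker-compose.override.yml
              {
                  printf 'services:\n'
                  printf '  webserver:\n'
                  printf '    volumes:\n'
                  printf '      - %s/data:/usr/src/paperless/data\n' "$host_root"
                  printf '      - %s/media:/usr/src/paperless/media\n' "$host_root"
              } > "$target"
          }

          # Example: create the host folders first, then write the override file.
          # mkdir -p /srv/paperless/data /srv/paperless/media
          # write_override /srv/paperless ./docker-compose.override.yml
          # docker compose config   # prints the merged result for review
          ```

          Documents already stored in the named volumes are not moved by this change; they would have to be copied into the host folders first.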

          • lemmyvore@feddit.nl
            English
            arrow-up
            9
            ·
            11 months ago

            I think OP wants it to leave their current files alone. But Paperless doesn’t work like that; it deletes the originals and arranges the files its own way.

            • chaospatterns@lemmy.world
              English
              arrow-up
              6
              arrow-down
              1
              ·
              edit-2
              11 months ago

              Paperless does support defining a folder structure that you can use to organize documents within that Paperless media volume; however, you should treat it as read-only.

              OP could use this as a way to keep their desired folder structure as much as possible, but it would have to be separate from the consumption folder.
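
              For illustration, and as far as I know, this folder structure is controlled by the PAPERLESS_FILENAME_FORMAT setting in paperless-ngx. The sketch below writes it into an env file; the file name docker-compose.env and the template are assumptions, and the placeholder syntax may differ between versions, so check the documentation for the release you run.

              ```shell
              #!/bin/sh
              # Add a storage-path template to the env file read by the Paperless container.
              # The file name and the template are examples, not required values.
              set_filename_format() {
                  env_file="$1"   # e.g. docker-compose.env
                  format="$2"     # e.g. {created_year}/{correspondent}/{title}
                  # Drop any previous setting, then append the new one.
                  if [ -f "$env_file" ]; then
                      grep -v '^PAPERLESS_FILENAME_FORMAT=' "$env_file" > "$env_file.tmp" || true
                      mv "$env_file.tmp" "$env_file"
                  fi
                  printf 'PAPERLESS_FILENAME_FORMAT=%s\n' "$format" >> "$env_file"
              }

              # Example:
              # set_filename_format docker-compose.env '{created_year}/{correspondent}/{title}'
              # Existing documents are only moved after the container is restarted and
              # the document_renamer management command has been run.
              ```

              Even with such a template, the resulting tree is still written by Paperless, so it stays something to read from and back up rather than edit by hand.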

            • AtariDump@lemmy.world
              English
              arrow-up
              3
              ·
              edit-2
              3 months ago

              I do want to leave my current files alone.

              Note: 8 months later and I invested the time into setting up Immich instead. Much better return on time investment.

        • B0rax@feddit.de
          English
          arrow-up
          6
          arrow-down
          2
          ·
          11 months ago

          It’s a docker volume. In this case it is managed by docker, but it is outside the container.

          To have it save everything on your normal filesystem, it should be possible to just edit the docker compose file (I have not tried that)

    • Eager Eagle@lemmy.world
      English
      arrow-up
      4
      ·
      edit-2
      11 months ago

      Bind mounts. Always use bind mounts for data you care about, otherwise the “managed by docker” volumes are fated to be forgotten.

      It won’t be your file structure, as the file tree is managed by Paperless, but at least with bind mounts you can easily navigate the files and back them up independently of Docker and Paperless.

  • Yarninator@lemmy.world
    arrow-up
    8
    ·
    11 months ago

    I’ve used it for a few months now and find it quite useful for storing and organizing my physical documents.

  • Samsy
    English
    arrow-up
    7
    ·
    11 months ago

    Paperless was my Docker training program. I made so many mistakes and ended up losing my database three times. My fourth try runs smoothly, and I back up everything regularly. Currently 1,300 documents.

    After indexing everything, I learned to love the archive feature. Docs that I scanned and don’t want to throw away in real life get a number in Paperless and the same number in the paper folder.

  • Goodtoknow@lemmy.ca
    English
    arrow-up
    6
    ·
    11 months ago

    I haven’t really configured a tagging system that makes any sense, so I mostly use it to search through documents by their text. I’d like to figure out how to hook up a vector database to it to do really fuzzy searching.

  • GravitySpoiledOP
    English
    arrow-up
    7
    arrow-down
    1
    ·
    edit-2
    11 months ago

    I’ve been using it for a couple of weeks. I barely use it, though I uploaded a lot of documents. It is very time-consuming to tag every old document after uploading it. It works great! But batch commands are missing and the mobile app isn’t on par with the web version.

    Moreover, OSS document scanner and pdf doc scan are great. I’d love to use paperless but I’m not sure if it’s the best solution right now.

    • tofubl@discuss.tchncs.de
      English
      arrow-up
      3
      ·
      11 months ago

      You can do batch operations in a document view. Select multiple documents and change the attributes in the top menu. Which commands are you missing?

    • PlutoniumAcid@lemmy.world
      English
      arrow-up
      20
      ·
      edit-2
      11 months ago

      Brother ADS-1700W is amazing!

      • no PC or USB required: place it anywhere
      • WiFi
      • scans a page double-sided to PDF in two seconds!
      • sends file to network share, ready to be consumed by Paperless
      • fully automatic, no button presses needed!
      • tiny footprint
      • document feeder
      • use with separator pages to bulk-scan many documents in one go

      😍

    • zaphod@lemmy.ca
      English
      arrow-up
      2
      ·
      11 months ago

      If you have an Android phone I can’t recommend Genius Scan enough. Fast, accurate, lots of features. I use it with syncthing by exporting the files to a folder that’s configured to sync the paperless input folder.

  • BobsAccountant@lemmy.world
    English
    arrow-up
    4
    ·
    edit-2
    11 months ago

    My family and I really like it. I invested in a small, physical scanner capable of network file sharing that we have plugged in and always ready to scan. When we get documents or receipts, we scan them and they’re immediately added to the database. I also have it checking an email address (mine is custom, but you could really have it check any address), and any time a PDF or similar is sent, it gets consumed and that email then gets sorted.

    There are a few downsides, however. As mentioned in other posts, turning your physical stack of documents into a digital stack of documents is just trading one pile for another. At least with a digital pile, you can sort a little quicker, but you still have to sort the consumed documents and check them to make sure the engine, which is supposed to be learning, has elected to sort the documents correctly.

    The compose stack is pretty easy to use, but it does benefit from a little knowledge in Docker/containers. Especially when the main container decides it’s not healthy. I wouldn’t recommend it to a first time Docker user, is all.

    Additionally, and as also mentioned previously, if you’re keeping important documents in it, encrypted storage with an encrypted backup is important.

  • KairuByte@lemmy.dbzer0.com
    English
    arrow-up
    4
    ·
    11 months ago

    I currently have a love-hate relationship with it, but that’s mostly because of issues outside of Paperless. I had been uploading to my server automatically with Nextcloud, and processing the files with Paperless as they came in. Next thing I know, all the files are gone and none of the documents are available in Paperless any longer, just the OCR translations that… leave something to be desired sometimes.

    I’ve scrapped the whole thing in the short term, and will likely try again in the long term. Just need to find the time.

    • h0rnman@lemmy.world
      English
      arrow-up
      4
      ·
      11 months ago

      Sounds like maybe you ran it as a container, didn’t mount the document archive externally, and then updated the container. That would likely have blown away the actual ingested documents but left the metadata (including the OCR data) where it was, assuming the database was either in its own container or mounted externally.

        • HKlino@feddit.deB
          English
          arrow-up
          1
          ·
          11 months ago

          I ran into this problem the moment Watchtower pulled the latest update, which included postgres:16. After setting the Docker Compose file back to postgres:15.4 and updating the stack, all the data reappeared.

  • nickiam2@aussie.zone
    English
    arrow-up
    3
    ·
    11 months ago

    Works great. Set it up a month ago and imported over 600 documents, both digital and scanned. Makes backups a lot easier too, as everything is in one place now.