This question is answered.


Replies: 17 - Last Post: 3 Jan 22, 22:10 - Last Post By: RobM
pointuri

Posts: 7
Registered: 28-Apr-2012
OCR
Posted: 16 Oct 21, 12:09
 
I often photograph boards, signs and similar objects with text, which I then re-use as captions for other photographs. It would be useful to embed an OCR function in jAlbum to auto-generate captions from text located within pictures. (Of course, the user would still have to edit the resulting text.)
There are a bunch of free solutions available, for instance MS OneNote and many web-based OCR tools, but they require the extra step of pasting/uploading files outside jAlbum. An integrated solution would be great!
RobM

Posts: 3,947
Registered: 4-Aug-2006
Re: OCR
Posted: 16 Oct 21, 12:48   in response to: pointuri
Helpful
It would also appear to be a feature of very limited use.

Have you considered using a standard OCR application and then creating a database/spreadsheet of the extracted texts? There you can enter the filename of the project object you want to attach the text to and the field it should go in, then export the data to a CSV or XML file. Then simply use the Menu>File>Import from database… command.
pointuri

Posts: 7
Registered: 28-Apr-2012
Re: OCR
Posted: 6 Nov 21, 20:28   in response to: RobM
Thanks, I understand that not many people may be interested in such a feature.
Thanks for pointing out the Import from database feature. It is a couple more steps than I would like, but it is indeed a workable solution.
RobM

Posts: 3,947
Registered: 4-Aug-2006
Re: OCR
Posted: 6 Nov 21, 20:39   in response to: pointuri
Helpful
There is a post, https://dzone.com/articles/reading-text-from-images-using-java-1
that shows some of the issues involved in including OCR. With jAlbum being multilingual, the size of the application would soon balloon to that of ‘bloatware’. jAlbum also wants to limit external library dependencies. Saving to a database should be more flexible too.
pointuri

Posts: 7
Registered: 28-Apr-2012
Re: OCR
Posted: 12 Dec 21, 10:48   in response to: RobM
Hello,
after a more careful look I realize that the "Import from database" feature modifies the original images (XMP fields), which I'd rather avoid. (I see there is an external tool to clean the XMP fields, but that would mean writing the images twice: write the XMP field, then clean it!) Is it possible to have a similar import feature that writes into the jAlbum files (for me, the target field is "comments") rather than into the images?
Thanks,
Yvan
RobM

Posts: 3,947
Registered: 4-Aug-2006
Re: OCR
Posted: 12 Dec 21, 11:20   in response to: pointuri
Correct
One option would be to copy your original images into jAlbum, instead of using the originals directly in your project. If you don’t want to do that, then a new external tool would be needed. For that, though, the text from the OCR would need to be:
A plain UTF-8 text file.
Each file named the same as the image, e.g. image1.jpg > image1.txt
The text files stored either in a separate directory or in each of the project’s image directories.

I think I have an external tool that could be modified to read those files and add the text to the image comment fields.

Unlikely these days, but some skins might still read such text files (within the project image directories) directly. You would need to try it or ask the skin developer.
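
As a rough illustration of the text-file convention described above (this is not RobM's tool), the following Java sketch derives the text file name from an image name and reads it as UTF-8. The class and method names (CaptionFiles, captionFileName, readCaption) are hypothetical, and the sketch uses only the Java standard library, not the jAlbum API.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

// Hypothetical helper, for illustration only; not part of jAlbum.
final class CaptionFiles {

    /** Maps "image1.jpg" to "image1.txt"; a name without an extension just gets ".txt" appended. */
    static String captionFileName(String imageFileName) {
        int dot = imageFileName.lastIndexOf('.');
        String base = (dot > 0) ? imageFileName.substring(0, dot) : imageFileName;
        return base + ".txt";
    }

    /**
     * Looks in textDir for the text file matching the image name and returns
     * its trimmed UTF-8 content, or an empty Optional if no such file exists.
     */
    static Optional<String> readCaption(Path textDir, String imageFileName) throws IOException {
        Path captionFile = textDir.resolve(captionFileName(imageFileName));
        if (!Files.isRegularFile(captionFile)) {
            return Optional.empty();
        }
        String text = new String(Files.readAllBytes(captionFile), StandardCharsets.UTF_8);
        return Optional.of(text.trim());
    }
}
```

Because the lookup is by file name only, it works the same whether the text files sit next to the images or are collected in one separate directory, as long as the image names are unique.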
RobM

Posts: 3,947
Registered: 4-Aug-2006
Re: OCR
Posted: 12 Dec 21, 11:28   in response to: RobM
If the skin allows modifications but doesn’t support the text files, you could add that function yourself, see https://jalbum.net/help/en/Sample_scripts#Reading_captions/comments_from_separate_text_files
pointuri

Posts: 7
Registered: 28-Apr-2012
Re: OCR
Posted: 12 Dec 21, 17:08   in response to: RobM
Thanks. I am not sure I would really be able to modify the skins much myself, as I am not a developer. I use Tiger. I can, however, generate .txt files from a bash script (that's how I generated the CSV file in the first place). If you already have an external tool handy that maps .txt files to comments or some other field, I would certainly be interested!
RobM

Posts: 3,947
Registered: 4-Aug-2006
Re: OCR
Posted: 12 Dec 21, 17:18   in response to: pointuri
I don’t have a tool that does that now, but I’ll have a look for the best one I do have to modify.
I’ll post back later tonight.
RobM

Posts: 3,947
Registered: 4-Aug-2006
Re: OCR
Posted: 12 Dec 21, 22:48   in response to: RobM
I found a tool that I could modify without too much effort, see attached. You can rename it as desired; just add it to the tools directory of jAlbum's configuration directory.

When you run it, it will ask for the location of your text files. They can be in the project directory along with the images they are for, or in a separate directory, all in one place if that is easier.

Select the directory containing the text files and click OK. When it has finished, open the system console (F7) to see a list of files that have matching text files and whether the comments were updated or not. Obviously, if the text files are all in one directory then each image must have a unique name, unless it is an intentional duplicate.

It is set to replace any existing comment with the text from the matching text file, if one exists. It is an easy change to make it prepend or append to existing comments.
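
To make the replace/prepend/append choice concrete, here is a minimal Java sketch of the three behaviours. It is not the code of the attached tool: the class name CommentMerge, the Mode enum and the newline separator are all assumptions made for illustration.

```java
// Hypothetical helper, for illustration only; not taken from the attached tool.
final class CommentMerge {

    enum Mode { REPLACE, PREPEND, APPEND }

    /**
     * Combines an existing comment with the text read from a text file.
     * If the file text is missing or blank, the existing comment is kept unchanged.
     * The newline used to join the two parts is an assumption.
     */
    static String merge(String existing, String fromFile, Mode mode) {
        if (fromFile == null || fromFile.trim().isEmpty()) {
            return existing;
        }
        String text = fromFile.trim();
        boolean noExisting = existing == null || existing.trim().isEmpty();
        if (noExisting || mode == Mode.REPLACE) {
            return text;
        }
        return (mode == Mode.PREPEND)
                ? text + "\n" + existing
                : existing + "\n" + text;
    }
}
```

For example, CommentMerge.merge("old", "new", CommentMerge.Mode.APPEND) yields "old", a line break, then "new", while Mode.REPLACE yields just "new".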

Please let me know how it goes
davidekholm

Posts: 3,576
Registered: 18-Oct-2002
Re: OCR
Posted: 12 Dec 21, 22:52   in response to: RobM
Can I ask why you don't want jAlbum's XMP inject tool to inject the metadata as XMP fields into the images? The way I see it, it's far more future-proof and secure to store the metadata close to the main image data in a readable textual format, like XMP.
pointuri

Posts: 7
Registered: 28-Apr-2012
Re: OCR
Posted: 14 Dec 21, 08:21   in response to: davidekholm
@RobM: thanks a lot, I will try it next week.
@davidekholm: there are several reasons. One is that my pictures are in a local folder that is synchronized with cloud storage, so each change would trigger a lot of network traffic. The jAlbum album files generate less traffic. Another is that I try to keep my pictures read-only to avoid any mistakes, which is not compatible with third-party software like jAlbum modifying the originals.
davidekholm

Posts: 3,576
Registered: 18-Oct-2002
Re: OCR
Posted: 14 Dec 21, 12:45   in response to: pointuri
Thanks for clarifying
pointuri

Posts: 7
Registered: 28-Apr-2012
Re: OCR
Posted: 19 Dec 21, 21:25   in response to: RobM
Hi RobM,
thanks for the tool. I now have more time to work on this. I tried it, and at the moment the tool parses the complete album, not only the current directory that I see in the "Explorer" view. As my album has hundreds or thousands of folders, it takes a huge amount of time even if the comments of only a few pictures have to change. Would it be possible to adjust the script so that only the currently open folder is parsed (I'd also rather the subfolders were NOT parsed, to be on the safe side)? I guess this line simply needs to be updated, but I am not sure how:
for (AlbumObject ao : rootFolder.getDescendants())
Thanks,
Yvan
davidekholm

Posts: 3,576
Registered: 18-Oct-2002
Re: OCR
Posted: 19 Dec 21, 21:43   in response to: pointuri
Change rootFolder.getDescendants() to currentFolder.getChildren()