Solr for Arabic PDFs



kerrikerrie385


07-27-2023, 12:32 AM

I am trying to search Arabic PDFs in Apache Solr. The problem appears to be that Tika extracts the PDF text in reversed order (left-to-right) instead of right-to-left, so the indexed text does not match Arabic queries.
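To illustrate the symptom, here is a minimal sketch of why reversed text breaks search. It assumes the extracted text is a plain character-by-character reversal of the original, which is a simplification (real output may also involve digits or mixed-direction runs). The class `ReversedTextDemo` and its methods are hypothetical names for this illustration only, and `contains` merely stands in for a match against the index.

```java
public class ReversedTextDemo {

    // Reverses the characters of a string, mimicking text that was
    // stored in the wrong (visual) order. Assumption: plain reversal.
    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    // Stand-in for a search: does the indexed text contain the query?
    static boolean matches(String indexed, String query) {
        return indexed.contains(query);
    }

    public static void main(String[] args) {
        // The Arabic word for "book" in logical order: kaf, teh, alef, beh.
        String logical = "\u0643\u062A\u0627\u0628";
        String visual = reverse(logical);

        System.out.println(matches(logical, logical));         // true
        System.out.println(matches(visual, logical));          // false
        System.out.println(matches(reverse(visual), logical)); // true
    }
}
```

The query is typed in logical order, so it only matches once the indexed text is in logical order as well.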

I have found references about this problem here:

- [1]
- [2]
- [3]

However, I don't know how to include the latest version of PDFBox or ICU4J in my Apache Solr installation. My Apache Solr `contrib/extraction/lib` folder contains `pdfbox-1.6.0.jar` and `icu4j-4.8.1.1.jar`. Will removing these files and replacing them with the latest libraries from their project pages be enough to make Tika use them?
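One way to check which jar is actually picked up after swapping the files might be to ask the JVM where a class was loaded from. This is only a sketch under assumptions: it has to run with the same classpath that Solr's extraction handler uses, the class names `org.apache.pdfbox.pdmodel.PDDocument` and `com.ibm.icu.text.Bidi` are the ones I believe the two libraries provide, and `WhichJar` and `locate` are hypothetical names.

```java
import java.security.CodeSource;

public class WhichJar {

    // Returns the location (usually a jar URL) a class was loaded from,
    // or a short marker if the class is missing or has no code source.
    static String locate(String className) {
        try {
            // Load without initializing, so static initializers do not run.
            Class<?> c = Class.forName(className, false,
                    WhichJar.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null
                    ? "(bootstrap or unknown)"
                    : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Assumed class names for PDFBox and ICU4J.
        String[] names = {
            "org.apache.pdfbox.pdmodel.PDDocument",
            "com.ibm.icu.text.Bidi"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

If the printed locations still point at the old jar names, the replacement did not take effect for that classpath.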

Please explain, as I have no previous experience with Java servlets. Thanks!

