Skip to content

Feature/spec filter#276

Open
strech345 wants to merge 23 commits intoorangecoding:masterfrom
strech345:feature/spec_filter
Open

Feature/spec filter#276
strech345 wants to merge 23 commits intoorangecoding:masterfrom
strech345:feature/spec_filter

Conversation

@strech345
Copy link
Copy Markdown
Contributor

@strech345 strech345 commented Mar 8, 2026

listings can be filtered by specs

  • specs are: price, size, rooms
  • by fiilters: maxPrice, minSize, minRooms
  • will be filtered out for notifications
  • will be filtered out on overview
  • for now its only specFilters, but also blacklist and areaFilter should be added here.
  • can be disabled on overview (in filter section)
  • add rooms into db because its next to price and size one of the three importen specs
  • show rooms in overview and detail
  • update kleinanzeigen config
  • add rooms for now only on kleinanzeigen but on the other it should also be added

@strech345
Copy link
Copy Markdown
Contributor Author

we are using INTEGER as type for the db but saving floating numbers at least for rooms and size.
Interessting that this will be saved as REAL when its an float. So it returns floating numbers.
Im not shure what would be good. For size its ok to reduze to integer, for rooms with 2.5 rooms it would be nice to store, but a bit to much to use REAL for such little numbers.
What we can do, but maybe ugly, multiply rooms by 10 to save and devide by 10 at read, so we can still use INTEGER.
@orangecoding what do you think?

  1. leave it like it is: default is INTERGER, but if the value is float sqlite will automatically store it as REAL
  2. take always REAL
  3. take INTEGER and multiply by 10

@strech345 strech345 closed this Mar 8, 2026
@orangecoding
Copy link
Copy Markdown
Owner

@strech345 why did you close the pr?

@strech345 strech345 reopened this Mar 9, 2026
@strech345
Copy link
Copy Markdown
Contributor Author

@strech345 why did you close the pr?

i really dont know why. I only changed to draft. But dont know why i close it.

@orangecoding
Copy link
Copy Markdown
Owner

So should I review it now or is this still a draft?

@strech345
Copy link
Copy Markdown
Contributor Author

So should I review it now or is this still a draft?

Would be nice if you can read my comments here and give me some Feedback. Also you can Start reviewing it. I would only extend it like explained.

@orangecoding
Copy link
Copy Markdown
Owner

Would be nice if you can read my comments here and give me some Feedback. Also you can Start reviewing it. I would only extend it like explained.

Tbh, I don't see an issue with this...?

leave it like it is: default is INTERGER, but if the value is float sqlite will automatically store it as REAL


function normalize(o) {
const size = o.size || '--- m²';
const parts = (o.tags || '').split('·').map((p) => p.trim());
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice

Copy link
Copy Markdown
Owner

@orangecoding orangecoding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@strech345 I have not (yet) checked out your fork and ran it on my machine, bur so far I have 4 comments. Let me know when I should check out the whole thing

return listings.map(this._providerConfig.normalize);
return listings.map((listing) => {
const normalized = this._providerConfig.normalize(listing);
// TODO: every provider should return price, size and rooms in type number. This makes it more strong and strict of the provider output. String formats like "m², Zi,..." should not be part and can be added on fe or massages. Move this logic into the provider-specific normalize function.
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make sure this is done in the pr?

// like for kleinanzeigen we have tags (includes multiple fields) but will be than extract at normalize, and deleted because its only internal used.
// I would suggest that we define a standard list like (id, price, rooms, size, title, link, description, address, image, url)
// it might be that some of this props value is null, wich is ok without id, link, title
// Also this might be not needed when using typings with typescript. I would suggest to move the whole project to typescript to have save typings.
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like typescript for various reasons which would take way to long to explain here, but Fredy won't move to Typescript ;)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thats ok, i now created new type files.
Would be nice to move more to type files so its on one place.

const keys = Object.keys(this._providerConfig.crawlFields);
// i removed it because crawlFields might be different than fields which are required.
// like for kleinanzeigen we have tags (includes multiple fields) but will be than extract at normalize, and deleted because its only internal used.
// I would suggest that we define a standard list like (id, price, rooms, size, title, link, description, address, image, url)
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get this, but I think what we then should do is letting each provider define their list of required params. A standard function that is available in every provider which returns an array of required keys..

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please explain me why this is needed? only to understand this.
it still only checks the keys (not the value), so its only about what is added in code.
i changed now the returned obj for every normalize fnc with a new obj like

return {
    id,
    link,
    title: o.title || '',
    price: extractNumber(o.price),
    size: extractNumber(o.size),
    rooms: extractNumber(rooms),
    address: `${ciity}, ${road}`,
    image,
    description: o.description,
  };

so its more clear what the provider really returns. I also add for all of this functions the type and add a jsonconfig to activate type checks.

// it might be that some of this props value is null, wich is ok without id, link, title
// Also this might be not needed when using typings with typescript. I would suggest to move the whole project to typescript to have save typings.
//const keys = Object.keys(this._providerConfig.crawlFields);
const keys = ['id', 'link', 'title'];
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not have a fixed list in here.

@orangecoding
Copy link
Copy Markdown
Owner

@strech345 You probably want to merge Master into your branch as I have just pushed a pretty big change. (There's a new migration as well, so you must increase the number of yours)
See https://github.com/orangecoding/fredy/releases/tag/20.0.0

@strech345
Copy link
Copy Markdown
Contributor Author

Ok, will take some time this week is full 😉

@strech345
Copy link
Copy Markdown
Contributor Author

@strech345 You probably want to merge Master into your branch as I have just pushed a pretty big change. (There's a new migration as well, so you must increase the number of yours) See https://github.com/orangecoding/fredy/releases/tag/20.0.0

whats about migration 11.add-spatial-filter? that was on master merged before. Now we have two 11s (11.mcp-tokens).

@orangecoding
Copy link
Copy Markdown
Owner

@strech345 You probably want to merge Master into your branch as I have just pushed a pretty big change. (There's a new migration as well, so you must increase the number of yours) See https://github.com/orangecoding/fredy/releases/tag/20.0.0

whats about migration 11.add-spatial-filter? that was on master merged before. Now we have two 11s (11.mcp-tokens).

Which is why I asked you to increase your migra to 12

@strech345
Copy link
Copy Markdown
Contributor Author

@strech345 You probably want to merge Master into your branch as I have just pushed a pretty big change. (There's a new migration as well, so you must increase the number of yours) See https://github.com/orangecoding/fredy/releases/tag/20.0.0

whats about migration 11.add-spatial-filter? that was on master merged before. Now we have two 11s (11.mcp-tokens).

Which is why I asked you to increase your migra to 12

ok i can rename it. Only to make the point clear: Boath migrations are right now on master.
https://github.com/orangecoding/fredy/tree/master/lib/services/storage/migrations/sql
when users uses the current version where boath migrations are 11 and we rename it now, the users might get problems with new version changed migration. But im not shure

@orangecoding
Copy link
Copy Markdown
Owner

@strech345 You probably want to merge Master into your branch as I have just pushed a pretty big change. (There's a new migration as well, so you must increase the number of yours) See https://github.com/orangecoding/fredy/releases/tag/20.0.0

whats about migration 11.add-spatial-filter? that was on master merged before. Now we have two 11s (11.mcp-tokens).

Which is why I asked you to increase your migra to 12

ok i can rename it. Only to make the point clear: Boath migrations are right now on master. https://github.com/orangecoding/fredy/tree/master/lib/services/storage/migrations/sql when users uses the current version where boath migrations are 11 and we rename it now, the users might get problems with new version changed migration. But im not shure

Fuck you're right. I missed it. Please pull from master

https://github.com/orangecoding/fredy/releases/tag/20.0.5

@strech345 strech345 marked this pull request as ready for review March 23, 2026 14:22
@orangecoding
Copy link
Copy Markdown
Owner

Hey @strech345 this pr is already massive. I strongly suggest IF you want to get this merged not to make this any bigger as I also have to review it.

Also make sure you keep the scope ;/)

@strech345
Copy link
Copy Markdown
Contributor Author

from my

Hey @strech345 this pr is already massive. I strongly suggest IF you want to get this merged not to make this any bigger as I also have to review it.

Also make sure you keep the scope ;/)

from my side its done and ready for review ;-)

@orangecoding
Copy link
Copy Markdown
Owner

@claude review

Repository owner deleted a comment from claude bot Mar 25, 2026
Copy link
Copy Markdown
Owner

@orangecoding orangecoding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@strech345 I've added a few comments. What concerns me is you removed the delete function from the filter

}

// filterByJobSettings: when true, filter listings by spec_filter in job settings
if (filterByJobSettings === true) {
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mhm.. instead of having to filter out this (I assume this has to be done, because you store the data to not scrape it many times), how about scraping it and if it hits something from the filter, simply set it to deleted. This way, it won't show up, you do not have to filter it here and you would save a lot of code...

If you pull master from my branch, you'll see how I did it in e.g. in _filterByArea:

    if (filteredIds.length > 0) {
      deleteListingsById(filteredIds);
    }

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i thought you still want to show all in the fe list. Also it would be possible to change the filter values and the listings would be dynamic filtered on the new values.
But i can change it and mark them as deleted. with this i can remove the fe changes.


if (filteredIds.length > 0) {
deleteListingsById(filteredIds);
return filteredListings;
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you remove the deleteListings here? That was added on purpose to not scrape things infinite?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be happend with merge. i dont realize that you did changes here.
i will fix it.

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks ❤️

throw new NoNewListingsWarning();
}
const sendNotifications = notify.send(this._providerId, newListings, this._notificationConfig, this._jobKey);
// TODO: move this to the notification adapter, so it will handle for all providers in same way.
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you resolve this todo before we proceed?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i dont did it because it means changes on all notification files. Also this feels not correct. i'm not shure where to put it. Maybei can put it into the notify fnc?

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok I get it. leave it as it is for now :)


it('should call deleteListingsById when listings are filtered by area', async () => {
// TODO: fix this test
it.skip('should call deleteListingsById when listings are filtered by area', async () => {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i disabled the test. dont unsterand how area filter should effect deleteListingsById because delete is called before. Would be nice if you can check this.

expect(notify.link).toContain('https://www.immobilienscout24.de/');

// check if there is at least one valid notification
const hasValidNotification = notificationObj.payload.some((notify) => {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i changed it because some values might missing depending on online listings. Would say other should also work the same to reduce test failures.


const filteredListings = listings
// this should never filter some listings out, because the normalize function should always extract all fields.
.filter((item) => requiredKeys.every((key) => key in item))
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

like you told me i add it, without knowledge for what.

// like for kleinanzeigen we have tags (includes multiple fields) but will be than extract at normalize, and deleted because its only internal used.
// I would suggest that we define a standard list like (id, price, rooms, size, title, link, description, address, image, url)
// it might be that some of this props value is null, wich is ok without id, link, title
// Also this might be not needed when using typings with typescript. I would suggest to move the whole project to typescript to have save typings.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thats ok, i now created new type files.
Would be nice to move more to type files so its on one place.

}

// filterByJobSettings: when true, filter listings by spec_filter in job settings
if (filterByJobSettings === true) {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i thought you still want to show all in the fe list. Also it would be possible to change the filter values and the listings would be dynamic filtered on the new values.
But i can change it and mark them as deleted. with this i can remove the fe changes.


if (filteredIds.length > 0) {
deleteListingsById(filteredIds);
return filteredListings;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be happend with merge. i dont realize that you did changes here.
i will fix it.

* @returns {ParsedListing[]} Listings considered unique enough to keep.
*/
_filterBySimilarListings(listings) {
_deleteSimilarListings(listings) {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will than also rename this back. before your changes all filters looks like filters and this not, so i renamed it. now every filter is working with delete so i will rename it back

throw new NoNewListingsWarning();
}
const sendNotifications = notify.send(this._providerId, newListings, this._notificationConfig, this._jobKey);
// TODO: move this to the notification adapter, so it will handle for all providers in same way.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i dont did it because it means changes on all notification files. Also this feels not correct. i'm not shure where to put it. Maybei can put it into the notify fnc?

@orangecoding
Copy link
Copy Markdown
Owner

@strech345 I loose track of the comments in here. Once you adressed all pr comments can you ping me please to check it one more time? Thanks :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants