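Traditionally, fetching large amounts of data can strain memory, because it usually means loading the entire result set into memory at once.
=> Stream query methods address this by letting you process data incrementally with Java 8 Streams. Only a portion of the data is held in memory at any time, which improves performance and scalability.
In this blog post, we take a closer look at how stream query methods work in Spring Data JPA, explore their use cases, and walk through an implementation. For this guide, we use: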
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-jpa</artifactId>
    </dependency>
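1. What are stream query methods?
A stream query method is a repository method whose return type is java.util.stream.Stream<T>, so results are consumed incrementally instead of being collected into a List first. Its main benefits:
Efficient resource management: data is processed incrementally, reducing memory overhead.
Lazy processing: results are fetched and processed on demand, which suits scenarios such as pagination or batch processing.
Integration with functional programming: streams fit Java's functional style and support operations such as filtering, mapping, and collecting.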
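To make the "lazy processing" point concrete, here is a minimal plain-Java sketch with no database involved. The `rows()` method is a hypothetical stand-in for a cursor-backed query result, and the counter only exists to show how many elements were actually pulled from the source.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LazyStreamDemo {
    // Counts how many "rows" the simulated source has actually produced.
    static final AtomicInteger fetched = new AtomicInteger();

    // Hypothetical stand-in for a database cursor: yields 1, 2, 3, ... on demand.
    static Stream<Integer> rows() {
        return Stream.iterate(1, i -> i + 1)
                .peek(i -> fetched.incrementAndGet());
    }

    // Returns the first n even rows; the source is only read as far as needed.
    static List<Integer> firstEvenRows(int n) {
        try (Stream<Integer> stream = rows()) {
            return stream
                    .filter(i -> i % 2 == 0)
                    .limit(n)
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) {
        System.out.println(firstEvenRows(3)); // prints [2, 4, 6]
        System.out.println("rows pulled from source: " + fetched.get());
    }
}
```

Although the source is unbounded, the pipeline stops pulling elements once `limit(3)` is satisfied. A real repository stream behaves similarly in spirit, but how many rows the database actually sends depends on the JDBC driver and fetch size.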
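Entities

    // Imports assume Spring Boot 3 (jakarta.persistence) and Lombok.
    import jakarta.persistence.*;
    import lombok.Getter;
    import lombok.Setter;
    import java.util.List;

    @Setter
    @Getter
    @Entity(name = "tbl_customer") // the original declared @Entity twice, which does not compile
    public class Customer {

        @Id
        @GeneratedValue(strategy = GenerationType.IDENTITY)
        private Long id;

        private String name;
        private String email;

        // Orders are loaded lazily unless a query fetches them explicitly.
        @OneToMany(mappedBy = "customer", cascade = CascadeType.ALL, fetch = FetchType.LAZY)
        private List<Order> orders;
    }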
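
    import jakarta.persistence.*;
    import lombok.Getter;
    import lombok.Setter;
    import java.time.LocalDateTime;

    @Setter
    @Getter
    @Entity(name = "tbl_order")
    public class Order {

        @Id
        @GeneratedValue(strategy = GenerationType.IDENTITY)
        private Long id;

        private Double amount;
        private LocalDateTime orderDate;

        // Owning side of the Customer <-> Order relationship.
        @ManyToOne
        @JoinColumn(name = "customer_id")
        private Customer customer;
    }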
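Repository

    import jakarta.persistence.QueryHint;
    import org.hibernate.jpa.AvailableHints;
    import org.springframework.data.jpa.repository.JpaRepository;
    import org.springframework.data.jpa.repository.Query;
    import org.springframework.data.jpa.repository.QueryHints;
    import org.springframework.data.repository.query.Param;
    import java.time.LocalDateTime;
    import java.util.stream.Stream;

    public interface CustomerRepository extends JpaRepository<Customer, Long> {

        // JPQL refers to the entity name (tbl_customer), not the class name.
        @Query("""
                SELECT c FROM tbl_customer c JOIN FETCH c.orders o
                WHERE o.orderDate >= :startDate
                """)
        // Hint asking the JDBC driver to retrieve rows in batches of 25.
        @QueryHints(
                @QueryHint(name = AvailableHints.HINT_FETCH_SIZE, value = "25")
        )
        Stream<Customer> findCustomerWithOrders(@Param("startDate") LocalDateTime startDate);
    }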
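The fetch size hint is easier to reason about with a small simulation. The sketch below is plain Java and does not touch JDBC or Hibernate: `BatchedCursor` is a hypothetical class that pretends to load rows from a database in batches, so you can see how a stream on top of it never buffers more than one batch. How a real JDBC driver honors the hint varies by driver and configuration, so treat this purely as an illustration of the idea.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class BatchedCursorDemo {

    /** Simulated cursor over rows 0..totalRows-1 that refills its buffer in batches. */
    static class BatchedCursor implements Iterator<Integer> {
        private final int totalRows;
        private final int fetchSize;
        private final Deque<Integer> buffer = new ArrayDeque<>();
        private int nextRow = 0;
        int roundTrips = 0;  // number of simulated trips to the database
        int maxBuffered = 0; // largest number of rows held in memory at once

        BatchedCursor(int totalRows, int fetchSize) {
            this.totalRows = totalRows;
            this.fetchSize = fetchSize;
        }

        // Loads up to fetchSize rows into the buffer (one simulated round trip).
        private void refill() {
            roundTrips++;
            for (int i = 0; i < fetchSize && nextRow < totalRows; i++) {
                buffer.add(nextRow++);
            }
            maxBuffered = Math.max(maxBuffered, buffer.size());
        }

        @Override
        public boolean hasNext() {
            if (buffer.isEmpty() && nextRow < totalRows) {
                refill();
            }
            return !buffer.isEmpty();
        }

        @Override
        public Integer next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return buffer.poll();
        }
    }

    // Wraps the cursor in a sequential stream that pulls rows on demand.
    static Stream<Integer> stream(BatchedCursor cursor) {
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(cursor, Spliterator.ORDERED), false);
    }

    public static void main(String[] args) {
        BatchedCursor cursor = new BatchedCursor(100, 25);
        int sum = stream(cursor).mapToInt(Integer::intValue).sum();
        System.out.println("sum=" + sum
                + ", roundTrips=" + cursor.roundTrips
                + ", maxBuffered=" + cursor.maxBuffered);
    }
}
```

With 100 simulated rows and a batch size of 25, the whole result is processed in four batches while at most 25 rows sit in the buffer at any moment. A larger batch size means fewer round trips but more memory per batch, which is the trade-off the hint lets you tune.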
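Service

    import lombok.RequiredArgsConstructor;
    import org.springframework.stereotype.Service;
    import org.springframework.transaction.annotation.Transactional;
    import java.time.LocalDateTime;
    import java.util.AbstractMap;
    import java.util.Map;
    import java.util.stream.Collectors;
    import java.util.stream.Stream;

    @Service
    @RequiredArgsConstructor
    public class CustomerOrderService {

        private final CustomerRepository customerRepository;

        // The stream must be consumed inside a transaction and closed afterwards,
        // hence the read-only transaction and the try-with-resources block.
        @Transactional(readOnly = true)
        public Map<String, Double> getCustomerOrderSummary(LocalDateTime startDate, double minOrderAmount) {
            try (Stream<Customer> customerStream = customerRepository.findCustomerWithOrders(startDate)) {
                return customerStream
                        // Keep only orders at or above the threshold, paired with the customer name
                        .flatMap(customer -> customer.getOrders().stream()
                                .filter(order -> order.getAmount() >= minOrderAmount)
                                .map(order -> new AbstractMap.SimpleEntry<>(customer.getName(), order.getAmount())))
                        // Group by customer name and sum the order amounts
                        .collect(Collectors.groupingBy(
                                AbstractMap.SimpleEntry::getKey,
                                Collectors.summingDouble(AbstractMap.SimpleEntry::getValue)
                        ));
            }
        }
    }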
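The aggregation in the service can be tried out without Spring or a database. The following sketch uses hypothetical in-memory records and made-up sample data in place of the JPA entities, and swaps `AbstractMap.SimpleEntry` for a small `NameAmount` record to keep the types explicit. The `onClose` handler stands in for the resources a repository stream holds, to show that try-with-resources releases them.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class OrderSummaryDemo {
    // Hypothetical in-memory stand-ins for the JPA entities.
    record Order(double amount) {}
    record Customer(String name, List<Order> orders) {}
    // Intermediate pair used instead of AbstractMap.SimpleEntry.
    record NameAmount(String name, double amount) {}

    // Set to true when the stream is closed, simulating resource release.
    static final AtomicBoolean closed = new AtomicBoolean(false);

    // Stand-in for the repository method; the sample data is invented.
    static Stream<Customer> findCustomers() {
        return Stream.of(
                new Customer("John Doe", List.of(new Order(150.0), new Order(40.0))),
                new Customer("Jane Roe", List.of(new Order(200.0), new Order(300.0)))
        ).onClose(() -> closed.set(true));
    }

    // Sums, per customer, the orders at or above minOrderAmount.
    static Map<String, Double> summarize(double minOrderAmount) {
        try (Stream<Customer> customers = findCustomers()) {
            return customers
                    .flatMap(c -> c.orders().stream()
                            .filter(o -> o.amount() >= minOrderAmount)
                            .map(o -> new NameAmount(c.name(), o.amount())))
                    .collect(Collectors.groupingBy(
                            NameAmount::name,
                            Collectors.summingDouble(NameAmount::amount)));
        }
    }

    public static void main(String[] args) {
        System.out.println(summarize(100.0));
        System.out.println("stream closed: " + closed.get());
    }
}
```

With a threshold of 100, the 40.0 order is dropped, so John Doe sums to 150.0 and Jane Roe to 500.0. Customers whose orders all fall below the threshold do not appear in the result at all, which matches the behavior of the service method.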
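Controller

    import lombok.RequiredArgsConstructor;
    import org.springframework.format.annotation.DateTimeFormat;
    import org.springframework.http.ResponseEntity;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RequestParam;
    import org.springframework.web.bind.annotation.RestController;
    import java.time.LocalDateTime;
    import java.util.Map;

    @RestController
    @RequestMapping("/customers")
    @RequiredArgsConstructor
    public class CustomerOrderController {

        private final CustomerOrderService customerOrderService;

        // GET /customers/orders?startDate=...&minOrderAmount=...
        @GetMapping("/orders")
        public ResponseEntity<Map<String, Double>> getCustomerOrderSummary(
                @RequestParam @DateTimeFormat(iso = DateTimeFormat.ISO.DATE_TIME) LocalDateTime startDate,
                @RequestParam double minOrderAmount
        ) {
            Map<String, Double> orderSummary =
                    customerOrderService.getCustomerOrderSummary(startDate, minOrderAmount);
            return ResponseEntity.ok(orderSummary);
        }
    }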
Testing
=> To create test data, you can run the following script from my source code or add your own data.
src/main/resources/dummy-data.sql
Request:

    curl --location 'http://localhost:8090/customers/orders?startDate=2024-05-01T00%3A00%3A00&minOrderAmount=100'

Response:

    {
      "Jane Roe": 500.0,
      "John Doe": 150.0,
      "Bob Brown": 350.0,
      "Alice Smith": 520.0
    }
Small dataset (10 customers, 100 orders)
Large dataset (10,000 customers, 100,000 orders)
Performance metrics
Metric | Stream | List |
---|---|---|
Initial fetch time | Slightly slower (due to lazy loading) | Faster (all at once) |
Memory consumption | Low (incremental processing) | High (entire dataset in memory) |
Processing overhead | Efficient for large datasets | May cause memory issues for large datasets |
Batch fetching | Supported (with fetch size) | Not applicable |
Error recovery | Graceful with early termination | Limited, as data is preloaded |
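What do you think about stream query methods? Share your experiences and use cases in the comments below!
See you in the next post. Happy coding!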